1 Introduction

Let (M, g) be a closed Riemannian manifold of dimension n, and denote the induced
volume form by vol. We assume that vol is normalised, i.e., that $\int_M \mathrm{vol} = 1$. The
group of diffeomorphisms of M is denoted Diff(M). The subgroup of volume preserving
diffeomorphisms is denoted $\mathrm{Diff}_{\mathrm{vol}}(M)$. Throughout the paper, the word "metric" always
refers to "Riemannian metric".

The study of geodesic equations on diffeomorphism groups was initiated by Arnold [2],
who discovered that the Euler equations of an incompressible perfect fluid correspond
to a geodesic equation on $\mathrm{Diff}_{\mathrm{vol}}(M)$ with respect to a right invariant $L^2$ metric. Since
Arnold's discovery, it has been shown that many equations of mathematical physics can
be put into this framework. (Such equations are often called Euler-Poincaré equations or
Euler-Arnold equations. See the monographs [3, 24, 20, 15] and the survey paper [34].)

The subject of this paper concerns geodesic equations corresponding to a class of 
right invariant Riemannian metrics on Diff(M). The significance of these metrics is that 
they descend to the homogeneous space $\mathrm{Diff}_{\mathrm{vol}}(M)\backslash\mathrm{Diff}(M)$ of right co-sets, naturally
identified with the space Dens(M) of smooth probability densities. 

Riemannian metrics and geodesic equations on Dens(M) are of importance in optimal
transport, probability theory, statistical mechanics, and quantum mechanics. The
connection between geodesics on Diff(M) and Dens(M) was pointed out by Otto [29],
who studied a non-invariant $L^2$ metric on Diff(M) which descends to Dens(M). (In this
setting, Dens(M) is identified with the homogeneous space $\mathrm{Diff}(M)/\mathrm{Diff}_{\mathrm{vol}}(M)$ of left
co-sets.) Remarkably, the corresponding metric on Dens(M) induces the $L^2$ Wasserstein
distance, and is therefore called the Wasserstein metric. Otto's observation implies that
the $L^2$ optimal mass transport problem, which belongs to the class of Monge-Kantorovich
problems, can be interpreted as a geodesic problem on Dens(M) with respect to the
Wasserstein metric (at least in the case of smooth densities). This follows from general
facts about Riemannian submersions: (i) a minimal geodesic between two fibres must be
horizontal, and (ii) a horizontal geodesic descends to a geodesic on the base.




Another important metric on the space of probability densities is the Fisher metric
(also called the Fisher-Rao, or entropy differential metric). Classically, the Fisher metric
occurs as a finite dimensional metric on smooth statistical models (called statistical
manifolds), and has a fundamental role in the field of information geometry [11, 31, 5, 1].
Friedrich [12] realised that statistical manifolds can be interpreted as finite dimensional 
submanifolds of Dens(M), and that the Fisher metric on such statistical manifolds is 
the restriction of one and the same canonical Fisher metric on Dens(M). Furthermore, 
Friedrich showed that this metric has constant positive curvature. 
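
The constant curvature statement can be made concrete through the square-root map. The following is only a sketch under assumptions that are not fixed in the text above: the Fisher metric is taken with the normalisation $G_\nu(a,b) = \tfrac14\int_M \tfrac{a}{\nu}\tfrac{b}{\nu}\,\nu$, and the names $\Phi$, $G$ and $S^\infty$ are introduced here purely for illustration.

```latex
% Assumed normalisation: G_\nu(a,b) = (1/4) \int_M (a/\nu)(b/\nu)\,\nu.
% Square-root map into the unit sphere of L^2(M,\mathrm{vol}):
\Phi : \mathrm{Dens}(M) \to S^\infty
     = \{ f \in L^2(M,\mathrm{vol}) \;;\; \|f\|_{L^2} = 1 \},
\qquad
\Phi(\nu) = \sqrt{\nu/\mathrm{vol}} .

% Derivative of \Phi in the direction a \in T_\nu\mathrm{Dens}(M):
T_\nu\Phi\cdot a = \tfrac12\,\frac{a}{\nu}\,\Phi(\nu) .

% Pulling back the L^2 inner product recovers G:
\langle T_\nu\Phi\cdot a ,\, T_\nu\Phi\cdot b \rangle_{L^2}
  = \tfrac14 \int_M \frac{a}{\nu}\,\frac{b}{\nu}\,\Phi(\nu)^2\,\mathrm{vol}
  = \tfrac14 \int_M \frac{a}{\nu}\,\frac{b}{\nu}\,\nu
  = G_\nu(a,b) .

% Hence geodesics are great-circle arcs, and the induced distance is
d(\mu,\nu) = \arccos \int_M \sqrt{\frac{\mu}{\mathrm{vol}}}\,
                              \sqrt{\frac{\nu}{\mathrm{vol}}}\;\mathrm{vol} .
```

Under this normalisation Dens(M) is isometric to an open subset of a unit sphere, which has constant curvature 1; a different scaling of the metric rescales the curvature accordingly.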

Khesin, Lenells, Misiołek, and Preston [19] introduced a right invariant degenerate $\dot H^1$
"metric" on Diff(M), which descends to the Fisher metric on Dens(M). (Dens(M) is
here identified with the right co-sets $\mathrm{Diff}_{\mathrm{vol}}(M)\backslash\mathrm{Diff}(M)$.) By taking Otto's point of
view, the authors then regard the geodesic problem, with respect to the Fisher metric, as
an optimal information transport problem, with respect to a degenerate cost function on
Diff(M) induced by the $\dot H^1$ "metric". Since the cost function is degenerate, the solutions
are not unique.

Ideally, one would like to have a right invariant metric on Diff(M) which descends to the
Fisher metric on Dens(M). It is remarked in [19] that examples of non-degenerate right
invariant metrics on Diff(M) descending to Dens(M) are lacking. The main motivation
for the work at hand is to construct such metrics, and thus complete the analogy between 
optimal mass transport and optimal information transport. 

Indeed, in this paper we introduce a 3-parameter family of right invariant metrics on
Diff(M) descending to the Fisher metric, and we prove local existence and uniqueness of
the geodesic equations. We also attain existence and uniqueness for the corresponding
optimal information transport problem, which, in turn, implies a polar factorisation
result for diffeomorphisms. This factorisation is analogous to the polar factorisation of
vector valued maps on $\mathbb{R}^n$, obtained by Brenier [4], and later generalised to manifolds by
McCann [25]. Our factorisation result can be seen as an infinite dimensional version of
the classical QR factorisation of matrices.

The right reduced geodesic equations for the family of metrics, which we are now going
to present, can be interpreted as higher dimensional generalisations of the μ-Hunter-Saxton
(μHS) equation, introduced by Khesin, Lenells, and Misiołek [18] (also called
μ-Camassa-Holm in [23]). μHS is a simple model for liquid crystals under the influence of
an external magnetic field.

Let $\mathfrak{X}(M)$ denote the smooth vector fields and $\Omega^k(M)$ the smooth k-forms on M.
Further, let

$$ \mathcal{F}(M) = \Big\{ f \in C^\infty(M) \;;\; \int_M f\,\mathrm{vol} = 0 \Big\}. $$

Recall the differential $d : \Omega^k(M) \to \Omega^{k+1}(M)$, and the co-differential $\delta : \Omega^k(M) \to \Omega^{k-1}(M)$.
The Laplace-de Rham operator $\Delta = -d\circ\delta - \delta\circ d$ restricted to $d\Omega^{k-1}(M)$
or $\delta\Omega^{k+1}(M)$ is an isomorphism [32]. In particular, it is an isomorphism on
$\mathcal{F}(M) = \delta\Omega^1(M)$. Let $\flat : \mathfrak{X}(M) \to \Omega^1(M)$ denote the flat map, also called the musical
isomorphism. Its inverse, the sharp map, is denoted $\sharp$. For $u \in \mathfrak{X}(M)$, we write $u^\flat$ instead of
$\flat(u)$ and correspondingly for $\sharp$.
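Consider the pseudo-differential operator $A : \mathfrak{X}(M) \to \Omega^1(M)$ defined by

$$ Au := \big(\mathrm{id} + d\circ\Delta^{-1}\circ\delta + \gamma\,\delta\circ\Delta^{-1}\circ d + \alpha\,\delta\circ d + \beta\, d\circ\delta\big)(u^\flat), \qquad (1) $$

where $\alpha, \beta > 0$ and $\gamma \in [0, 1]$ are parameters. We are interested in the integro-differential
equation given by

$$ \dot m + \mathcal{L}_u m + m\,\mathrm{div}(u) = 0, \qquad m = Au, \qquad (2a) $$

where $\mathcal{L}_u$ denotes the Lie derivative along u and $\dot m = \partial m/\partial t$. By a solution we mean a
curve $t \mapsto u(t) \in \mathfrak{X}(M)$ such that u fulfils equation (2a). The equation also admits the
form

$$ \frac{d}{dt}(m\otimes\mathrm{vol}) + \mathcal{L}_u(m\otimes\mathrm{vol}) = 0, \qquad m = Au, \qquad (2b) $$

which follows since

$$ \mathcal{L}_u(m\otimes\mathrm{vol}) = (\mathcal{L}_u m)\otimes\mathrm{vol} + m\otimes\mathrm{div}(u)\,\mathrm{vol} = \big(\mathcal{L}_u m + m\,\mathrm{div}(u)\big)\otimes\mathrm{vol}. $$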

The paper is organised as follows. In §2 we show that equation (2) is a right reduced
equation for geodesics on Diff(M), i.e., an Euler-Arnold equation. Local existence and
uniqueness of the Cauchy problem is given in §3. In §4 we discuss characterisation and
construction of right invariant and descending metrics, and we show that the family of
metrics constructed in this paper descends to the Fisher metric. In §5 we first consider an
abstract geometric framework for right invariant optimal transport problems and polar
factorisation. Then, in §5.1, we focus on the case of optimal information transport
using the new metric, and we derive a polar factorisation result for $H^s$ diffeomorphisms.
Finally, we show in §5.2 that the classical QR factorisation of matrices can be viewed as
a polar factorisation corresponding to optimal transport of inner products on $\mathbb{R}^n$. The
relation to the Cholesky factorisation of symmetric matrices is also pointed out.

Before this, we continue below with a derivation of yet another form of equation (2), 
based on the Hodge decomposition. This form reveals some structural properties and 
relation to other equations. 

1.1 Hodge components 

From the Helmholtz decomposition it follows that $\mathfrak{X}(M) = \mathfrak{X}_{\mathrm{vol}}(M) \oplus \mathrm{grad}(\mathcal{F}(M))$,
where $\mathfrak{X}_{\mathrm{vol}}(M)$ denotes the divergence free vector fields. Hence, every $u \in \mathfrak{X}(M)$ can be
decomposed uniquely as $u = \xi + \mathrm{grad}(f)$, with $\xi \in \mathfrak{X}_{\mathrm{vol}}(M)$ and $f \in \mathcal{F}(M)$. Notice that f
is unique, since it is required to be normalised. This is an orthogonal decomposition with
respect to the $L^2$ inner product on $\mathfrak{X}(M)$, which is given by

$$ \langle u, v\rangle_{L^2} = \int_M g(u,v)\,\mathrm{vol}. $$

For k-forms, the Hodge decomposition is given by

$$ \Omega^k(M) = \mathcal{H}^k(M) \oplus \delta\Omega^{k+1}(M) \oplus d\Omega^{k-1}(M), $$

where $\mathcal{H}^k(M) = \{a \in \Omega^k(M) \,;\, \Delta a = 0\}$ is the space of harmonic k-forms. This
decomposition is orthogonal with respect to the $L^2$ inner product on $\Omega^k(M)$, which is given
by

$$ \langle a, b\rangle_{L^2} = \int_M a \wedge \star b, $$

where $\star : \Omega^k(M) \to \Omega^{n-k}(M)$ is the Hodge star map. Notice that $\langle u, v\rangle_{L^2} = \langle u^\flat, v^\flat\rangle_{L^2}$.
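Let $D^k(M) = \mathcal{H}^k(M) \oplus \delta\Omega^{k+1}(M)$. Then $D^k(M) = \{a \in \Omega^k(M) \,;\, \delta a = 0\}$ is the space
of co-closed k-forms. The relation between the Hodge decomposition and the Helmholtz
decomposition is:

$$ \mathfrak{X}_{\mathrm{vol}}(M)^\flat = D^1(M), \qquad \mathrm{grad}(\mathcal{F}(M))^\flat = d\Omega^0(M). $$

In other words, the musical isomorphism $\flat : \mathfrak{X}(M) \to \Omega^1(M)$ is diagonal with respect
to the Helmholtz and Hodge decompositions. The same holds for the pseudo-differential
operator $A : \mathfrak{X}(M) \to \Omega^1(M)$. That is,

$$ A\,\mathfrak{X}_{\mathrm{vol}}(M) = D^1(M), \qquad A\,\mathrm{grad}(\mathcal{F}(M)) = d\Omega^0(M). $$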

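From the Hodge decomposition we also obtain a finer decomposition

$$ \mathfrak{X}_{\mathrm{vol}}(M) = \mathfrak{X}_{\mathcal{H}}(M) \oplus \mathfrak{X}_{\mathrm{vol,ex}}(M), $$

where $\mathfrak{X}_{\mathrm{vol,ex}}(M) = \delta\Omega^2(M)^\sharp$ are the exact volume preserving vector fields, and
$\mathfrak{X}_{\mathcal{H}}(M) = \mathcal{H}^1(M)^\sharp$ are the harmonic vector fields. A is also diagonal with respect to this finer
decomposition. Indeed, the $L^2$ orthogonal projection operator $R : \Omega^1(M) \to \mathcal{H}^1(M)$
onto the harmonic part is given by

$$ R = \mathrm{id} + d\circ\Delta^{-1}\circ\delta + \delta\circ\Delta^{-1}\circ d, $$

and the $L^2$ orthogonal projection operator $P : \Omega^1(M) \to D^1(M)$ onto the co-closed part
is given by

$$ P = \mathrm{id} + d\circ\Delta^{-1}\circ\delta. $$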
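
As a sanity check on the signs, the following short computation verifies that P and R behave as projections on each Hodge component. It is a sketch that uses only the convention $\Delta = -d\circ\delta - \delta\circ d$ stated earlier, together with the fact that $\Delta$ commutes with d and $\delta$; the symbols $\varphi$ and $a$ are illustrative.

```latex
% (i) P annihilates exact forms.  For \varphi \in \mathcal{F}(M) we have
%     \delta d\varphi = -\Delta\varphi, hence
P\,d\varphi = d\varphi + d\,\Delta^{-1}\delta d\varphi
            = d\varphi - d\,\Delta^{-1}\Delta\varphi
            = d\varphi - d\varphi = 0 .

% (ii) P fixes co-closed forms: if \delta a = 0 then
P a = a + d\,\Delta^{-1}\delta a = a .

% (iii) R annihilates co-exact forms.  If a \in \delta\Omega^2(M), then
%       \delta a = 0 and \delta d\,b = -\Delta b for co-exact b, so
\delta\,\Delta^{-1} d\,a = \delta d\,\Delta^{-1} a
                         = -\Delta\,\Delta^{-1} a = -a ,
\qquad
R a = a + 0 - a = 0 .

% (iv) R fixes harmonic forms: if \Delta h = 0 then d h = 0 and
%      \delta h = 0, so R h = h.
```

Together with (i), which also gives $R\,d\varphi = 0$ because $d\,d\varphi = 0$, this confirms that P has range $D^1(M)$ and R has range $\mathcal{H}^1(M)$.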

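From the definition of A it follows that $A = (\gamma R + (1-\gamma)P + \alpha\,\delta\circ d + \beta\, d\circ\delta)\circ\flat$. Now,
if $h \in \mathfrak{X}_{\mathcal{H}}(M)$ then

$$ Ah = \underbrace{(\gamma R + (1-\gamma)P)h^\flat}_{h^\flat} + \alpha\,\delta\underbrace{dh^\flat}_{0} + \beta\, d\underbrace{\delta h^\flat}_{0} = h^\flat \in \mathcal{H}^1(M), $$

if $\xi \in \mathfrak{X}_{\mathrm{vol,ex}}(M)$ then

$$ A\xi = \gamma\underbrace{R\xi^\flat}_{0} + (1-\gamma)\underbrace{P\xi^\flat}_{\xi^\flat} + \alpha\,\delta d\xi^\flat + \beta\, d\underbrace{\delta\xi^\flat}_{0} = (1-\gamma)\xi^\flat + \alpha\,\delta d\xi^\flat \in \delta\Omega^2(M), $$

and

$$ A\,\mathrm{grad}(f) = (1-\gamma)\underbrace{P\,df}_{0} + \gamma\underbrace{R\,df}_{0} + \alpha\,\delta\underbrace{d\,df}_{0} - \beta\, d\Delta f = -\beta\, d\Delta f \in d\Omega^0(M). $$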
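Thus, if we represent $u = h + \xi + \mathrm{grad}(f)$ by its unique "Helmholtz-Hodge components"
$(h, \xi, f) \in \mathfrak{X}_{\mathcal{H}}(M) \times \mathfrak{X}_{\mathrm{vol,ex}}(M) \times \mathcal{F}(M)$, we have that

$$ A(h, \xi, f) = \big(h^\flat,\ ((1-\gamma)\mathrm{id} - \alpha\Delta)\xi^\flat,\ -\beta\Delta f\big) \in \mathcal{H}^1(M) \times \delta\Omega^2(M) \times \mathcal{F}(M). $$

Since both $((1-\gamma)\mathrm{id} - \alpha\Delta)\circ\flat : \mathfrak{X}_{\mathrm{vol,ex}}(M) \to \delta\Omega^2(M)$ and $\Delta : \mathcal{F}(M) \to \mathcal{F}(M)$ are
invertible operators, it follows that A is also invertible (see §3 for details).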
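
The invertibility of the middle component can be seen from a short spectral argument. This is a sketch, assuming only the standard fact that on a closed manifold $-\Delta$ has a discrete, strictly positive spectrum on $\delta\Omega^2(M)$; the eigenpairs $(\lambda_j, e_j)$ and the coefficients $c_j$ are notation introduced here.

```latex
% Eigen-decomposition of -\Delta on co-exact 1-forms (assumed notation):
-\Delta e_j = \lambda_j e_j, \qquad \lambda_j > 0, \qquad
\xi^\flat = \sum_j c_j e_j \in \delta\Omega^2(M) .

% The operator acts diagonally:
\big((1-\gamma)\mathrm{id} - \alpha\Delta\big)\xi^\flat
   = \sum_j \big((1-\gamma) + \alpha\lambda_j\big) c_j e_j .

% Every multiplier is strictly positive because \alpha > 0 and \gamma \le 1:
(1-\gamma) + \alpha\lambda_j \;\ge\; \alpha\lambda_j \;>\; 0 ,

% so the inverse is obtained by dividing each coefficient:
\big((1-\gamma)\mathrm{id} - \alpha\Delta\big)^{-1}\xi^\flat
   = \sum_j \frac{c_j}{(1-\gamma) + \alpha\lambda_j}\, e_j .
```

The same computation shows why the third component $-\beta\Delta$ is invertible on $\mathcal{F}(M)$ only when $\beta \neq 0$.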
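Our aim is now to write equation (2a) in terms of the Hodge components

$$ \sigma := \big(\gamma R + (1-\gamma)P + \alpha\,\delta\circ d\big)(u^\flat) \in D^1(M) $$

and

$$ \rho := \Delta f = \mathrm{div}(u) \in \mathcal{F}(M), $$

corresponding to the partial Hodge decomposition $\Omega^1(M) = D^1(M) \oplus d\mathcal{F}(M)$. In these
variables $m = \sigma - \beta\, d\rho$, so equation (2a) becomes

$$ \dot\sigma - \beta\, d\dot\rho + \mathcal{L}_u\sigma - \beta\, d\mathcal{L}_u\rho + \rho\sigma - \beta\rho\, d\rho = 0 $$
$$ \Updownarrow $$
$$ \dot\sigma + \mathcal{L}_u\sigma + \rho\sigma - \beta\, d\Big(\dot\rho + \mathcal{L}_u\rho + \frac{\rho^2}{2}\Big) = 0. $$

Notice, in general, that $\mathcal{L}_u\sigma + \rho\sigma \notin D^1(M)$ and that $\mathcal{L}_u\rho + \frac{\rho^2}{2} \notin \mathcal{F}(M)$. Thus, in order to
find the Hodge components, we have to introduce a Lagrangian multiplier $p \in C^\infty(M)$.
We can always find a p such that $\mathcal{L}_u\sigma + \rho\sigma + dp \in D^1(M)$, and such a p is uniquely
determined up to a constant. Further, we can always determine the constant part of p
in such a way that $\mathcal{L}_u\rho + \frac{\rho^2}{2} + \frac{p}{\beta} \in \mathcal{F}(M)$. Continuing from above

$$ \underbrace{\dot\sigma + \mathcal{L}_u\sigma + \rho\sigma + dp}_{\in D^1(M)} - \beta\, d\Big(\underbrace{\dot\rho + \mathcal{L}_u\rho + \frac{\rho^2}{2} + \frac{p}{\beta}}_{\in \mathcal{F}(M)}\Big) = 0. $$

We now obtain equation (2a) in terms of the Hodge components as

$$ \begin{aligned}
\dot\sigma + \mathcal{L}_u\sigma + \rho\sigma &= -dp, & \sigma &= \big(\gamma R + (1-\gamma)\mathrm{id} - \alpha\Delta\big)(Pu^\flat), \\
\dot\rho + \mathcal{L}_u\rho + \frac{\rho^2}{2} &= -\frac{p}{\beta}, & \rho &= \mathrm{div}(u), \\
\delta\sigma &= 0, \\
\int_M \rho\,\mathrm{vol} &= 0,
\end{aligned} \qquad (2c) $$

where the "pressure" $p \in C^\infty(M)$ is a Lagrangian multiplier, determined uniquely by the
two constraint equations.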
Notice from equation (2c) that if $\sigma(t_0) = 0$ at some time $t_0$, then $\dot\sigma(t_0) = 0$. As a
consequence, $\mathrm{grad}(\mathcal{F}(M))$ is an invariant subspace of equation (2), so if $u(t_0) \in \mathrm{grad}(\mathcal{F}(M))$
then $u(t) \in \mathrm{grad}(\mathcal{F}(M))$ for all t. From a geometric point of view, the reason for this is
that the corresponding right invariant metric on Diff(M) descends to the homogeneous
space $\mathrm{Diff}_{\mathrm{vol}}(M)\backslash\mathrm{Diff}(M) \simeq \mathrm{Dens}(M)$, as is described in §4. In contrast, $\rho(t_0) = 0$ does
not imply that $\dot\rho(t_0) = 0$, so $\mathfrak{X}_{\mathrm{vol}}(M)$ is not an invariant subspace. However, if $\rho(t_0) = 0$
then it follows from equation (2c) that $\dot\rho(t_0)$ is arbitrarily small for large enough $\beta$. In
the case $\gamma = 0$, this observation suggests that solutions to equation (2) may converge to
solutions of the Euler-α fluid equation as $\beta \to \infty$, which is to be investigated in future
work. We do not expect good behaviour of solutions as $\beta \to 0$, since A is not invertible
for $\beta = 0$.

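Equation (2) is a higher dimensional generalisation of the μHS equation, studied by
Khesin, Lenells, and Misiołek [18]. Indeed, if $M = S^1$ then $\mathfrak{X}_{\mathrm{vol}}(S^1) = \mathfrak{X}_{\mathcal{H}}(S^1) \simeq \mathbb{R}$
consists of the constant vector fields on $S^1$. Writing $u = \xi + \mathrm{grad}(f)$ with $\xi$ constant in x,
equation (2c) then becomes

$$ \dot\xi + 2\xi u_x = -p_x $$
$$ \dot u_x + u u_{xx} + \tfrac12 (u_x)^2 = -\frac{p}{\beta}. $$

From the first equation it follows that

$$ 0 = \int_{S^1} (\dot\xi + 2\xi u_x + p_x)\,dx = \dot\xi\,\mu(S^1), \qquad \text{where } \mu(S^1) = \int_{S^1} dx, $$

which implies that $\dot\xi = 0$. If the second equation is differentiated with respect to x we
get

$$ \dot u_{xx} + 2u_x u_{xx} + u u_{xxx} = -\frac{p_x}{\beta}. $$

Since $\int_{S^1} u\,dx = \int_{S^1} \xi\,dx$ it follows that $\xi$ is the mean of u over $S^1$, i.e.,

$$ \xi = \mu(u) := \frac{1}{\mu(S^1)}\int_{S^1} u\,dx. $$

Thus, since $-p_x = 2\xi u_x$ by the first equation, we finally arrive at

$$ \dot u_{xx} + 2u_x u_{xx} + u u_{xxx} - \frac{2\mu(u)}{\beta}\,u_x = 0, $$

which is the μHS equation.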

A different generalisation of μHS, from $M = S^1$ to $M = \mathbb{T}^n$ (the n-dimensional flat
torus), is given by Kohlmann [21]. The equation suggested there is also an Euler-Arnold
equation, hence a geodesic equation on $\mathrm{Diff}(\mathbb{T}^n)$. The corresponding right invariant
metric does not descend to density space.
The geodesic equation for a right (or left) invariant metric on a Lie group G can be
reduced to an equation on the Lie algebra $\mathfrak{g}$, called an Euler-Poincaré or Euler-Arnold
equation. The abstract form of this equation, first written down by Poincaré [30], is

$$ A\dot u + \mathrm{ad}^*_u(Au) = 0, \qquad (3) $$

where $A : \mathfrak{g} \to \mathfrak{g}^*$ is the inertia operator induced by the inner product on $\mathfrak{g}$ corresponding
to the right invariant metric, and $\mathrm{ad}^*_u : \mathfrak{g}^* \to \mathfrak{g}^*$ is the infinitesimal action of u on $\mathfrak{g}^*$, i.e.,
the dual operator of $\mathrm{ad}_u : \mathfrak{g} \to \mathfrak{g}$.

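In our case, G = Diff(M), $\mathfrak{g} = \mathfrak{X}(M)$ and $\mathrm{ad}_u = -\mathcal{L}_u$, i.e., minus the Lie derivative
(acting on vector fields). We identify the dual of $\mathfrak{X}(M)$ with $\Omega^1(M)$ via the pairing

$$ \langle m, u\rangle = \int_M i_u m\,\mathrm{vol} = \langle m, u^\flat\rangle_{L^2}. $$
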



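Next, we introduce an inner product on $\mathfrak{X}(M)$, whose inertia operator is given by (1).
Indeed, a non-degenerate inner product on $\mathfrak{X}(M)$ is given by

$$ \langle u, v\rangle_{\alpha\beta\gamma} = \langle P_\gamma u^\flat, P_\gamma v^\flat\rangle_{L^2} + \alpha\langle du^\flat, dv^\flat\rangle_{L^2} + \beta\langle \delta u^\flat, \delta v^\flat\rangle_{L^2}, \qquad (4) $$

where $P_\gamma = \gamma R + (1-\gamma)P$ is introduced to simplify the notation. Notice that (4) is
different from the Sobolev a-b-c inner product, recently considered in [19], since only the
divergence free components occur in the first term. By using that $\langle a, \delta b\rangle_{L^2} = \langle da, b\rangle_{L^2}$
and $\langle P_\gamma u^\flat, P_\gamma v^\flat\rangle_{L^2} = \langle P_\gamma u^\flat, v^\flat\rangle_{L^2}$ we get

$$ \langle u, v\rangle_{\alpha\beta\gamma} = \langle P_\gamma u^\flat + \alpha\,\delta du^\flat + \beta\, d\delta u^\flat, v^\flat\rangle_{L^2} = \langle Au, v\rangle, $$

so A in (1) is the inertia tensor corresponding to the inner product (4).

Using this inner product, we define a right invariant metric $\langle\langle\cdot,\cdot\rangle\rangle_{\alpha\beta\gamma}$ on Diff(M) by
right translation of vectors to $\mathfrak{X}(M) = T_{\mathrm{id}}\mathrm{Diff}(M)$. Explicitly,

$$ \langle\langle U, V\rangle\rangle_{\alpha\beta\gamma} = \langle U\circ\varphi^{-1}, V\circ\varphi^{-1}\rangle_{\alpha\beta\gamma}, \qquad (5) $$

for $U, V \in T_\varphi\mathrm{Diff}(M)$.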

In the special case $M = S^1$ we have that $\mathrm{Diff}_{\mathrm{vol}}(S^1) = \mathrm{Rot}(S^1)$, i.e., the one dimensional
manifold of rigid rotations. Thus, $\mathfrak{X}_{\mathrm{vol}}(S^1) \simeq \mathbb{R}$ consists of the constant vector
fields on $S^1$. The inner product (4) on $\mathfrak{X}(S^1)$ then becomes

$$ \langle u, v\rangle_{\alpha\beta\gamma} = \int_{S^1} u\,ds \int_{S^1} v\,ds + \beta\int_{S^1} u_x v_x\,ds, $$

which for $\beta = 1$ is exactly the inner product defining the μHS metric. Therefore, the α-β-γ
metric (5) on Diff(M) is a generalisation of the μHS metric on $\mathrm{Diff}(S^1)$ to arbitrary compact
manifolds.
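
For orientation, the reduction of (4) on the circle can be traced term by term. The following is a sketch assuming the normalisation $\int_{S^1} ds = 1$ from §1 and a vector field written as $u = u(x)\,\partial_x$; it uses only the operators defined earlier.

```latex
% On S^1 every 2-form vanishes, so
u^\flat = u\,ds, \qquad d u^\flat = 0, \qquad \delta u^\flat = -u_x .

% There are no co-exact 1-forms (\delta\Omega^2(S^1) = 0), hence P = R and
P_\gamma u^\flat = \gamma R u^\flat + (1-\gamma) R u^\flat = R u^\flat
                 = \Big(\int_{S^1} u\,ds\Big)\,ds .

% Term by term in (4):
\langle P_\gamma u^\flat, P_\gamma v^\flat\rangle_{L^2}
   = \int_{S^1} u\,ds \int_{S^1} v\,ds, \qquad
\alpha\langle du^\flat, dv^\flat\rangle_{L^2} = 0, \qquad
\beta\langle \delta u^\flat, \delta v^\flat\rangle_{L^2}
   = \beta \int_{S^1} u_x v_x\,ds .
```

In particular the parameters $\alpha$ and $\gamma$ play no role in dimension one; they only become visible when co-exact 1-forms exist, i.e., for $n \geq 2$.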



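The dual operator of $-\mathcal{L}_u : \mathfrak{X}(M) \to \mathfrak{X}(M)$ is computed as

$$ \begin{aligned}
\langle m, -\mathcal{L}_u v\rangle &= -\int_M m \wedge \star(\mathcal{L}_u v)^\flat = -\int_M m \wedge i_{\mathcal{L}_u v}\mathrm{vol} \\
&= -\int_M m \wedge \big(\mathcal{L}_u i_v\mathrm{vol} - i_v\mathcal{L}_u\mathrm{vol}\big) \\
&= -\int_M \mathcal{L}_u(m \wedge i_v\mathrm{vol}) + \int_M \big(\mathcal{L}_u m \wedge i_v\mathrm{vol} + \mathrm{div}(u)\, m \wedge i_v\mathrm{vol}\big) \\
&= \int_M \big(\mathcal{L}_u m + \mathrm{div}(u)m\big) \wedge i_v\mathrm{vol} = \langle \mathcal{L}_u m + \mathrm{div}(u)m, v\rangle.
\end{aligned} $$

As a result, $\mathrm{ad}^*_u(m) = \mathcal{L}_u m + \mathrm{div}(u)m$. From (3) we then obtain the following.

Proposition 2.1. Equation (2) is the Euler-Arnold equation for the geodesic flow on
Diff(M) with respect to the right invariant α-β-γ metric (5).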

3 Local existence and uniqueness 

In this section we show that equation (2) is well posed as a Cauchy problem. The
approach is that of Ebin and Marsden [7], which is to prove that the geodesic spray
corresponding to the α-β-γ metric (5) is smooth with respect to Sobolev $H^s$ topologies.
Let N be a smooth finite dimensional manifold. If $s > n/2$ then the set $H^s(M, N)$
of maps from M to N of Sobolev differentiability $H^s$ is a Banach manifold (in fact,
$H^s(M, N)$ is a Hilbert manifold, but that is not relevant for our analysis). Let
$\pi_N : TN \to N$ be the canonical projection. The tangent space at $f \in H^s(M, N)$ is given by

$$ T_f H^s(M, N) = \{v \in H^s(M, TN) \,;\, \pi_N \circ v = f\}. $$

Thus, $TH^s(M, N) = H^s(M, TN)$. By iterating this we obtain, for higher order tangent
spaces, that $T^k H^s(M, N) = H^s(M, T^k N)$.

If $s > n/2 + 1$, which we assume throughout the remainder, then $\mathrm{Diff}^s(M)$, i.e., the
set of bijective maps in $H^s(M, M)$ whose inverses also belong to $H^s(M, M)$, is an open
subset of $H^s(M, M)$, and therefore also a Banach manifold. Since $\mathrm{Diff}^s(M)$ is open
in $H^s(M, M)$, it holds that $T_\varphi\mathrm{Diff}^s(M) = T_\varphi H^s(M, M)$. In particular, $T_{\mathrm{id}}\mathrm{Diff}^s(M) = \mathfrak{X}^s(M)$,
i.e., the vector fields on M of Sobolev type $H^s$.

If $\psi \in \mathrm{Diff}^s(M)$, then right multiplication $\mathrm{Diff}^s(M) \ni \varphi \mapsto \varphi\circ\psi \in \mathrm{Diff}^s(M)$ is smooth.
However, $\mathrm{Diff}^s(M)$ is not a Banach Lie group, because left multiplication is not smooth.
Instead, $\mathrm{Diff}^s(M)$ is a topological group, i.e., the group operations are continuous. For
details, see [7, §2].

Let us now introduce the lifted inertia operator, given by $A^\sharp := \sharp\circ A$, with A defined
by equation (1). We then have the following.

Lemma 3.1. $A^\sharp$ is a smooth isomorphism $\mathfrak{X}^s(M) \to \mathfrak{X}^{s-2}(M)$.
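Proof. Let $R : \Omega^{1,s}(M) \to \mathcal{H}^1(M)$ and $P_{\mathrm{ex}} : \Omega^{1,s}(M) \to \delta\Omega^{2,s+1}(M)$ be the Hodge
projections onto $\mathcal{H}^1(M)$ and $\delta\Omega^{2,s+1}(M)$ respectively. These are smooth mappings since
the Hodge decomposition of $\Omega^{1,s}(M)$ is smooth (see [7]). Thus, since the musical
isomorphism is smooth, the operator $Z : u \mapsto (Ru^\flat, P_{\mathrm{ex}}u^\flat, \mathrm{div}(u))$ is a smooth mapping
$\mathfrak{X}^s(M) \to \mathcal{H}^1(M) \times \delta\Omega^{2,s+1}(M) \times \mathcal{F}^{s-1}(M)$. For every s, it is in fact an isomorphism,
so it has a smooth inverse which is given by $(h, \sigma, \rho) \mapsto h^\sharp + \sigma^\sharp + \mathrm{grad}(\Delta^{-1}(\rho))$. (Notice
that $\Delta^{-1} : \mathcal{F}^{s-1}(M) \to \mathcal{F}^{s+1}(M)$.)

From the definition (1) of A it follows that

$$ A^\sharp = Z^{-1}\circ\big(\mathrm{id},\ (1-\gamma)\mathrm{id} - \alpha\Delta,\ -\beta\Delta\big)\circ Z. $$

This is an isomorphism $\mathfrak{X}^s(M) \to \mathfrak{X}^{s-2}(M)$ since $((1-\gamma)\mathrm{id} - \alpha\Delta) : \delta\Omega^{2,s+1}(M) \to \delta\Omega^{2,s-1}(M)$
and $\Delta : \mathcal{F}^{s-1}(M) \to \mathcal{F}^{s-3}(M)$ are isomorphisms (see [32]). The inverse is
given by

$$ (A^\sharp)^{-1} = Z^{-1}\circ\big(\mathrm{id},\ ((1-\gamma)\mathrm{id} - \alpha\Delta)^{-1},\ -\tfrac{1}{\beta}\Delta^{-1}\big)\circ Z. $$

This concludes the result. □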

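From the definition of u it follows that $u(\varphi(x)) = \dot\varphi(x)$ for $x \in M$. By differentiating
this with respect to t, we obtain

$$ \frac{d}{dt}\big(u(\varphi(x))\big) = \ddot\varphi(x) \in T^2_{(\varphi(x),\dot\varphi(x))}M. \qquad (6) $$

The Levi-Civita connection $\nabla$, induced by the Riemannian metric g on M, defines a
diffeomorphism between the second tangent bundle $T^2M$ and the Whitney sum $TM \oplus TM$
by $(c, \dot c, \ddot c) \mapsto (c, \dot c, \nabla_{\dot c}\dot c)$. By pointwise operations, this identifies the second tangent
bundle $T^2\mathrm{Diff}^s(M)$ with the Whitney sum $T\mathrm{Diff}^s(M) \oplus T\mathrm{Diff}^s(M)$. By the ω-lemma
(see e.g. [7]) the identification is smooth. Using this identification, and the fact that
$u = \dot\varphi\circ\varphi^{-1}$, we can express equation (6) as

$$ \dot u + \nabla_u u = \Big(\frac{D}{dt}\dot\varphi\Big)\circ\varphi^{-1}, $$

where $\frac{D}{dt}\dot\varphi(x) := \nabla_{\dot\varphi(x)}\dot\varphi(x)$ is the co-variant derivative along the path itself. We can
now write equation (2a) as

$$ A^\sharp\Big(\Big(\frac{D}{dt}\dot\varphi\Big)\circ\varphi^{-1}\Big) = -(\mathcal{L}_u Au)^\sharp - (A^\sharp u)\,\mathrm{div}(u) + A^\sharp\nabla_u u =: F(u). \qquad (7) $$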

The approach is to show that this defines a smooth spray on $\mathrm{Diff}^s(M)$, i.e., a smooth
vector field

$$ S : T\mathrm{Diff}^s(M) \to T^2\mathrm{Diff}^s(M) \simeq T\mathrm{Diff}^s(M) \oplus T\mathrm{Diff}^s(M). $$

Let $R_\psi : \mathrm{Diff}^s(M) \to \mathrm{Diff}^s(M)$ denote composition with $\psi \in \mathrm{Diff}^s(M)$ from the
right, i.e., $R_\psi(\varphi) = \varphi\circ\psi$. As already mentioned, this is a smooth mapping, so the
corresponding tangent mapping $TR_\psi$, given by $T_\varphi\mathrm{Diff}^s(M) \ni v \mapsto v\circ\psi \in T_{\varphi\circ\psi}\mathrm{Diff}^s(M)$,
is also smooth. Let $T\mathrm{Diff}^{s-k}(M)\upharpoonright\mathrm{Diff}^s(M)$ denote the restriction of the tangent bundle
$T\mathrm{Diff}^{s-k}(M)$ to the base $\mathrm{Diff}^s(M)$. This is a smooth Banach vector bundle (see [7,
Appendix A]). If $B : \mathfrak{X}^s(M) \to \mathfrak{X}^{s-k}(M)$ then we denote by $\tilde B$ the bundle mapping
$T\mathrm{Diff}^s(M) \to T\mathrm{Diff}^{s-k}(M)\upharpoonright\mathrm{Diff}^s(M)$ given by

$$ \tilde B : (\varphi, \dot\varphi) \mapsto (\varphi, B_\varphi\dot\varphi), \qquad B_\varphi := TR_\varphi\circ B\circ TR_{\varphi^{-1}}. $$

If B is smooth, then the mapping $B_\varphi : T_\varphi\mathrm{Diff}^s(M) \to T_\varphi\mathrm{Diff}^{s-k}(M)$ is smooth for fixed
$\varphi \in \mathrm{Diff}^s(M)$. However, in general, even if B is smooth, the mapping $\tilde B$ need not be
smooth. This is because the operation $\varphi \mapsto \varphi^{-1}$ is not smooth. However, the following
key lemmas resolve the situation in our specific case.

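Lemma 3.2. The mapping

$$ \tilde A^\sharp : T\mathrm{Diff}^s(M) \to T\mathrm{Diff}^{s-2}(M)\upharpoonright\mathrm{Diff}^s(M) $$

is a smooth vector bundle isomorphism.

Proof. We have

$$ A^\sharp = R^\sharp + (1-\gamma)P^\sharp_{\mathrm{ex}} + W, $$

where $P^\sharp_{\mathrm{ex}} = \sharp\circ P_{\mathrm{ex}}\circ\flat$ and $R^\sharp = \sharp\circ R\circ\flat$ are the lifted Hodge projections and
$W = \alpha\,\sharp\circ\delta\circ d\circ\flat - \beta\,\mathrm{grad}\circ\mathrm{div}$. $\tilde P^\sharp_{\mathrm{ex}}$ and $\tilde R^\sharp$ are smooth bundle maps $T\mathrm{Diff}^s(M) \to T\mathrm{Diff}^s(M)$,
see [7, Appendix A, Lemmas 2, 3, 6]. Thus, $\tilde P^\sharp_{\mathrm{ex}}$ and $\tilde R^\sharp$ are also smooth as mappings
$T\mathrm{Diff}^s(M) \to T\mathrm{Diff}^{s-2}(M)\upharpoonright\mathrm{Diff}^s(M)$. That $\tilde W$ is smooth follows from [7, Appendix A,
Lemma 2].

In a local chart in a neighbourhood of $\varphi \in \mathrm{Diff}^s(M)$, the derivative of $\tilde A^\sharp$ at $(\varphi, \dot\varphi)$ is
a smooth linear mapping of the form

$$ \begin{pmatrix} \mathrm{id} & 0 \\ * & A^\sharp_\varphi \end{pmatrix} : T_\varphi\mathrm{Diff}^s(M) \times T_\varphi\mathrm{Diff}^s(M) \to T_\varphi\mathrm{Diff}^s(M) \times T_\varphi\mathrm{Diff}^{s-2}(M). $$

It follows from Lemma 3.1 that $A^\sharp_\varphi$ is a linear isomorphism, with smooth inverse given by
$(A^\sharp)^{-1}_\varphi$. The result now follows from the inverse function theorem for Banach manifolds. □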

Lemma 3.3. Let $B : \mathfrak{X}^s(M) \to \mathfrak{X}^{s-k}(M)$ be a smooth linear differential operator of
order k. If $s > n/2 + k$, then the mapping

$$ u \mapsto B\nabla_u u - \nabla_u Bu = [B, \nabla_u]u $$

is a smooth non-linear differential operator $\mathfrak{X}^s(M) \to \mathfrak{X}^{s-k}(M)$.

Proof. If f and g are scalar differential operators of order k and l respectively, then
[f, g] is a scalar differential operator of order k + l − 1, since the order k + l differential
terms in the commutator cancel each other. In general, this is not true for vector valued
differential operators. However, for a fixed v, the linear operator $u \mapsto \nabla_v u$ is given in
components by

$$ \nabla_v u = \big(v^j\partial_j u^i + \Gamma^i_{jk}v^j u^k\big)e_i, $$

so the part of $\nabla_v$ that is differentiating is acting diagonally on the elements of u. We
write $\nabla_v u = Gu + f(u^i)e_i$, where $G : \mathfrak{X}^s(M) \to \mathfrak{X}^s(M)$ is tensorial and f is a scalar
differential operator of order 1. If $B = (b_{ij})$, so that $B(u^1 e_1 + \ldots + u^n e_n) = b_{ij}(u^j)e_i$,
then

$$ [B, \nabla_v]u = [B, G]u + \big([b_{ij}, f]u^j\big)e_i. $$

Since G is tensorial, [B, G] is a differential operator of the same order as B, that is k.
Since f and $b_{ij}$ are scalar differential operators of order 1 and k, it holds that $[b_{ij}, f]$ is
of order k + 1 − 1 = k. Since $\nabla_v Bu$ differentiates v zero times, and $B\nabla_v u$ differentiates
v at most k times, it is now clear that the total operation $u \mapsto [B, \nabla_u]u$ differentiates u
at most k times. This finishes the proof. □

Lemma 3.4. Let $B : \mathfrak{X}^s(M) \to \mathfrak{X}^{s-k}(M)$ be a smooth linear differential operator of
order k. If $s > n/2 + k$, then the mapping

$$ \tilde B : T\mathrm{Diff}^s(M) \to T\mathrm{Diff}^{s-k}(M)\upharpoonright\mathrm{Diff}^s(M) $$

is a smooth bundle map.

Proof. Assume first that B is of order 1. Then, locally, $\tilde B(\varphi, \dot\varphi)$ is constructed by rational
combinations of $\varphi^i$, $\dot\varphi^i$, $\partial\varphi^i/\partial x^j$, $\partial\dot\varphi^i/\partial x^j$. Smoothness then follows since pointwise multiplications
are smooth operations (see [7, Appendix A, Lemma 2]). We can now, at least locally,
decompose B into the composition of first order operators, so that $B = B_1\cdots B_k$. It
then holds that $\tilde B = \tilde B_1\cdots\tilde B_k$, and by the argument above, each $\tilde B_i$ is smooth and drops
differentiability by 1. This finishes the proof. □

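Remark 3.5. Notice that if $Q : \mathfrak{X}^s(M) \times \mathfrak{X}^{s-l}(M) \to \mathfrak{X}^{s-k}(M)$ is a bilinear differential
operator, of order $k \geq 0$ in its first argument and order $k - l \geq 0$ in its second argument,
then Lemma 3.4 implies that

$$ \tilde Q : T\mathrm{Diff}^s(M) \times T\mathrm{Diff}^{s-l}(M)\upharpoonright\mathrm{Diff}^s(M) \to T\mathrm{Diff}^{s-k}(M)\upharpoonright\mathrm{Diff}^s(M) $$

is smooth whenever $s > n/2 + k$.

Lemma 3.6. If $s > n/2 + 2$, then the mapping

$$ \tilde F : T\mathrm{Diff}^s(M) \to T\mathrm{Diff}^{s-2}(M)\upharpoonright\mathrm{Diff}^s(M) $$

is a smooth bundle map.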

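Proof. For any $v \in \mathfrak{X}(M)$ we have $(\mathcal{L}_u v^\flat)^\sharp = \mathcal{L}_u v + 2\,\mathrm{Def}(u)v$, where $\mathrm{Def}(u)$ is the
type (1,1) tensor defined by $\tfrac12(\mathcal{L}_u g)(v, \cdot) = g(\mathrm{Def}(u)v, \cdot)$ (see e.g. [32, §2.3]). Thus

$$ \begin{aligned}
F(u) &= -(\mathcal{L}_u Au)^\sharp - (A^\sharp u)\,\mathrm{div}(u) + A^\sharp\nabla_u u \\
&= -\mathcal{L}_u A^\sharp u - 2\,\mathrm{Def}(u)A^\sharp u - (A^\sharp u)\,\mathrm{div}(u) + A^\sharp\nabla_u u \\
&= -\nabla_u A^\sharp u + \nabla_{A^\sharp u}u - 2\,\mathrm{Def}(u)A^\sharp u - (A^\sharp u)\,\mathrm{div}(u) + A^\sharp\nabla_u u \\
&= \big(A^\sharp\nabla_u - \nabla_u A^\sharp\big)u + \nabla_{A^\sharp u}u - 2\,\mathrm{Def}(u)A^\sharp u - (A^\sharp u)\,\mathrm{div}(u),
\end{aligned} $$

where we have used that $\mathcal{L}_u v = \nabla_u v - \nabla_v u$.

Let $Q : \mathfrak{X}^s(M) \times \mathfrak{X}^{s-2}(M) \to \mathfrak{X}^{s-2}(M)$ be the bilinear mapping

$$ Q(u, v) := \nabla_v u - 2\,\mathrm{Def}(u)v - v\,\mathrm{div}(u). $$

Notice that Q is tensorial in v and of order 1 in u. If $s > n/2 + 2$ then Q is smooth.
Write $A^\sharp = P + W$, where $P = R^\sharp + (1-\gamma)P^\sharp_{\mathrm{ex}}$ and W is a linear differential operator
of order 2 as above. We now have

$$ F(u) = [A^\sharp, \nabla_u]u + Q(u, A^\sharp u) = [P, \nabla_u]u + [W, \nabla_u]u + Q(u, Pu) + Q(u, Wu). $$

The approach is to show that each of these terms is of maximal order 2 and smooth
under conjugation with right translation.

For the first term, we have

$$ \widetilde{[P, \nabla_{(\cdot)}]} = \tilde P\circ\tilde\nabla_{(\cdot)} - \tilde\nabla_{(\cdot)}\circ\tilde P. $$

We already know that $\tilde P : T\mathrm{Diff}^s(M) \to T\mathrm{Diff}^s(M)$ is smooth. From Lemma 3.4 and
Remark 3.5 it follows that $\tilde\nabla_{(\cdot)} : T\mathrm{Diff}^s(M) \to T\mathrm{Diff}^{s-1}(M)\upharpoonright\mathrm{Diff}^s(M)$ is smooth.

For the second term, $(u, v) \mapsto [W, \nabla_v]u$ is a bilinear differential operator. From
Lemma 3.3 it follows that it is of order 2 (since W is of order 2). From Lemma 3.4
and Remark 3.5 it then follows that $\widetilde{[W, \nabla_{(\cdot)}]}$ is smooth.

For the third term, it follows from Lemma 3.4 and Remark 3.5 that $\tilde Q$ is smooth of
order 1. Since $\tilde P$ is smooth of order 0, we get that $\tilde Q(\cdot, \tilde P\,\cdot)$ is smooth of order 1.

For the fourth term, $(u, v) \mapsto Q(u, Wv)$ is a bilinear differential operator of order 1
and 2 in its arguments. It then follows from Lemma 3.4 and Remark 3.5 that $\tilde Q(\cdot, \tilde W\,\cdot)$
is smooth of order 2.

Altogether, we now have that $\tilde F : T\mathrm{Diff}^s(M) \to T\mathrm{Diff}^{s-2}(M)\upharpoonright\mathrm{Diff}^s(M)$ is smooth,
which finishes the proof. □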



Equation (7) can be written

$$ \tilde A^\sharp\Big(\varphi, \frac{D}{dt}\dot\varphi\Big) = \tilde F(\varphi, \dot\varphi). \qquad (8) $$



We now obtain the main result in this section. 

Theorem 3.7. If $s > n/2 + 2$, then the geodesic spray

$$ S : T\mathrm{Diff}^s(M) \ni (\varphi, \dot\varphi) \mapsto \big(\varphi, \dot\varphi, ((\tilde A^\sharp)^{-1}\circ\tilde F)(\varphi, \dot\varphi)\big) \in T\mathrm{Diff}^s(M) \oplus T\mathrm{Diff}^s(M) $$

corresponding to the α-β-γ metric (5) on $\mathrm{Diff}^s(M)$ is smooth.

Proof. Follows from Lemma 3.2 and Lemma 3.6. □

In turn, this result implies that the geodesic equation is locally well posed, and that
the solution depends smoothly on the initial data.

Corollary 3.8. Under the same conditions as in Theorem 3.7, the Riemannian exponential

$$ \mathrm{Exp} : T\mathrm{Diff}^s(M) \to \mathrm{Diff}^s(M) $$

corresponding to the α-β-γ metric (5) on $\mathrm{Diff}^s(M)$ is smooth. Further, if $\varphi \in \mathrm{Diff}^s(M)$
then

$$ \mathrm{Exp}_\varphi : T_\varphi\mathrm{Diff}^s(M) \to \mathrm{Diff}^s(M) $$

is a local diffeomorphism from a neighbourhood of 0 to a neighbourhood of $\varphi$.

Proof. Follows from standard results about smooth sprays on Banach manifolds [22]. □

4 Descending metrics and the space of densities

Let $\pi : E \to B$ be a smooth fibre bundle. In this section we characterise pairs of
Riemannian metrics on E and B for which the projection $\pi$ is a Riemannian submersion.
We do this in three steps, by introducing more and more structure to the fibre bundle:

1. First, the plain case $\pi : E \to B$. A basic characterisation of all the descending
metrics is given.

2. Second, the case when $\pi : E \to B$ is a principal H-bundle. This allows us to
characterise descending metrics in terms of H-invariance.

3. Third, the case when E = G, where G is a Lie group, and B is a G-homogeneous
space, i.e., there is a transitive Lie group action of G on B. Then the projection
$\pi_b : g \mapsto b\cdot g$, for any fixed element $b \in B$, defines a principal $G_b$-bundle, where $G_b$
is the isotropy group of b. This structure allows us to consider metrics on G which
are both right invariant and descending.

The main example is G = Diff(M) and B = Dens(M), i.e., the space of smooth
densities on M (see below). The main result is that the α-β-γ metric (5) on Diff(M)
descends to the right invariant canonical $L^2$ metric on Dens(M) (the Fisher metric).

Let us begin with the plain case. The kernel of the derivative of the projection map $\pi$
defines the vertical distribution $\mathcal{V}$ on E, i.e., for each $x \in E$

$$ \mathcal{V}_x = \{v \in T_xE \,;\, T_x\pi\cdot v = 0\}. $$

If $g_E$ is a Riemannian metric on E, then we can also define the horizontal distribution
$\mathcal{H} = \mathcal{V}^\perp$ as the orthogonal complement of $\mathcal{V}$ with respect to $g_E$.

Definition 4.1. A Riemannian metric $g_E$ on E is called descending if there exists a
Riemannian metric $g_B$ on B such that

$$ g_E(u, v) = \pi^*g_B(u, v) \qquad \forall\, u, v \in \mathcal{H}. $$

Thus, a metric $g_E$ on E is descending if and only if there exists a metric $g_B$ on B
such that $\pi$ is a Riemannian submersion, i.e., $T\pi : TE \to TB$ preserves the length of
horizontal vectors.

If $g_E$ is a descending metric, then the metric $g_B$ is unique. This follows since

$$ T_x\pi : \mathcal{H}_x \to T_{\pi(x)}B $$

is an isomorphism for each $x \in E$.

We now show how to construct descending metrics. Let $g_B$ be any Riemannian metric
on B. Then we can lift $g_B$ to a positive semi-definite bilinear form $\pi^*g_B$ on E. Next,
let h be another positive semi-definite bilinear form on E such that $\ker(h) \cap \mathcal{V} = \{0\}$ and
the co-dimension of $\ker(h)$ is equal to the dimension of $\mathcal{V}$. Then

$$ g_E = \pi^*g_B + h \qquad (9) $$

is a descending Riemannian metric on E. Notice that $\ker(\pi^*g_B) = \mathcal{V}$ and $\ker(h) = \mathcal{H}$.
Thus, $g_E(u, v) = \pi^*g_B(u, v)$ for all $u, v \in \mathcal{H}$, so $g_E$ is indeed descending. Notice also
that the horizontal distribution is independent of the choice of $g_B$.

The form (9) characterises all descending metrics. Indeed, if $g_E$ is a descending metric,
let $g_B$ be the corresponding metric on B and let $P : TE \to \mathcal{V}$ be the orthogonal projection
onto $\mathcal{V}$ with respect to $g_E$. Then $g_E$ is of the form (9) with $h(u, v) := g_E(u, Pv)$.
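
A two dimensional toy case may help to see the decomposition (9) at work. The following is an illustrative sketch only: the bundle $\pi : \mathbb{R}^2 \to \mathbb{R}$, $\pi(x,y) = x$, and the coefficient functions $a, b, c$ are assumptions made here and do not appear elsewhere in the paper.

```latex
% Assumed setting: E = \mathbb{R}^2, B = \mathbb{R}, \pi(x,y) = x, and
g_E = a\,dx^2 + 2b\,dx\,dy + c\,dy^2, \qquad c > 0, \quad ac - b^2 > 0 .

% Vertical and horizontal distributions:
\mathcal{V} = \mathrm{span}\{\partial_y\}, \qquad
\mathcal{H} = \mathcal{V}^\perp = \mathrm{span}\{c\,\partial_x - b\,\partial_y\},
\quad\text{since}\quad g_E(c\,\partial_x - b\,\partial_y,\ \partial_y) = cb - bc = 0 .

% Completing the square gives the splitting (9):
g_E = \underbrace{\Big(a - \frac{b^2}{c}\Big)dx^2}_{\pi^* g_B}
    + \underbrace{c\,\Big(dy + \frac{b}{c}\,dx\Big)^2}_{h},
\qquad \ker(h) = \mathcal{H}, \quad \ker(h)\cap\mathcal{V} = \{0\} .

% g_E is descending precisely when the candidate base metric is well defined:
\partial_y\Big(a - \frac{b^2}{c}\Big) = 0 .
```

The last condition is the only genuine restriction: the splitting itself always exists pointwise, but $g_B$ must not depend on the point chosen in the fibre.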

Consider now the second case. That is, let H be a Lie group and consider the case
when $\pi : E \to B$ is a principal H-bundle, with a left action $L_h : E \to E$ for $h \in H$. Being
a principal bundle, the fibres are parameterised by H, so $\pi\circ L_h = \pi$ and if $\pi(x) = \pi(y)$
then there exists a unique $h \in H$ such that $y = L_h(x)$. Thus, if $g_B$ is a Riemannian
metric on B, then $\pi^*g_B = (\pi\circ L_h)^*g_B = L_h^*\pi^*g_B$. It follows that if $g_E$ is a descending
metric, then

$$ L_h^*g_E(u, v) = g_E(u, v) \qquad \forall\, u, v \in \mathcal{H}. \qquad (10) $$

The converse is also true.

Proposition 4.2. Let $g_E$ be a Riemannian metric on E. Then $g_E$ is descending if and
only if it fulfils (10).

Proof. We have already shown ⇒ so ⇐ remains. Assuming (10), define $g_B$ in the
following way. For $\bar u, \bar v \in T_xB$, take any point $y \in \pi^{-1}(\{x\})$. The linear map
$T_y\pi : \mathcal{H}_y \to T_xB$ is an isomorphism, so we get $u, v \in \mathcal{H}_y$ by $u = T_y\pi^{-1}\cdot\bar u$ and
$v = T_y\pi^{-1}\cdot\bar v$. Now, define $g_B$ by

$$ g_B(\bar u, \bar v) := g_E(u, v). $$

This is a well defined metric on B, i.e., it is independent of which $y \in \pi^{-1}(\{x\})$ we use.
Indeed, for another $y' \in \pi^{-1}(\{x\})$ we get $u', v' \in \mathcal{H}_{y'}$ as above. Also, $y' = L_h(y)$ for
some $h \in H$. From (10) it then follows that $g_E(u, v) = g_E(u', v')$, so $g_B$ is well defined.
By construction, $g_E(u, v) = \pi^*g_B(u, v)$ for all $u, v \in \mathcal{H}$, so $g_E$ is indeed a descending
metric. □

We now specialise even further to the third case. That is, let G be a Lie group with 
identity e. Denote by Lg and Rg respectively the left and right action of 51 G G on G. 
Assume that G has a right transitive action Rg on a manifold B. [B is then called a 
G-homogeneous space.) lib G B, then G^ = {5 € G;Rg{b) = b} denotes the isotropy 
Lie subgroup of G. For every 6 € B we then have a principle Gfe-bundle iri, : G ^ B, 
where iTbig) — Rgib)- Notice that this structure implies that B is diffeomorphic to the 
homogeneous space Gf,\G of right co-sets. The map vrfe, which is well defined on Gb\G, 
provides a diffeomorphism. (If b,b' € B we also have that G^ and G^/ are conjugate 
subgroups, i.e., there exists a 5 G G such that gGbg~^ = Gy-) 

We are interested in Riemannian metrics $\mathsf{g}_G$ on $G$ which are right invariant, i.e., for
which

\[ \mathsf{g}_G(u,v) = \mathsf{g}_G(TR_g \cdot u, TR_g \cdot v) \]

or equivalently $R_g^*\mathsf{g}_G = \mathsf{g}_G$. Notice that right invariance does not imply that the metric
is descending. Indeed, in order for a right invariant metric $\mathsf{g}_G$ to be descending with
respect to $\pi_b$, it follows from Proposition 4.2 that

\[ L_h^*R_g^*\mathsf{g}_G(u,v) = \mathsf{g}_G(u,v) \qquad \forall\, u,v \in \mathcal{H},\ \forall\, g \in G,\ \forall\, h \in G_b, \]

where $\mathcal{H}$ denotes the horizontal distribution. Since $\mathsf{g}_G$ is right invariant and since the
right action descends to $G_b\backslash G$, i.e., $R_g$ maps fibres to fibres, it is enough to check the
condition for $g = h^{-1}$ and for vectors $u,v \in \mathcal{H}_e = \mathfrak{g}_b^\perp$, where $\mathfrak{g}_b$ is the Lie algebra of $G_b$.
Indeed, the following result is given in [19].

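Proposition 4.3. Let $\mathsf{g}_G$ be a right invariant metric on $G$. Then $\mathsf{g}_G$ is descending (with
respect to $\pi_b$) if and only if

\[ \mathsf{g}_G(\mathrm{ad}_\xi(u), v) + \mathsf{g}_G(u, \mathrm{ad}_\xi(v)) = 0 \qquad \forall\, u,v \in \mathfrak{g}_b^\perp,\ \xi \in \mathfrak{g}_b. \]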

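Consider now the reverse question, i.e., if $\mathsf{g}_G = \pi^*\mathsf{g}_B + h$ is a descending metric, when
is it right invariant? Since right invariance means that $\mathsf{g}_G = R_g^*\mathsf{g}_G$ it must hold that
$R_g^*\pi^*\mathsf{g}_B = \pi^*\mathsf{g}_B$ and $R_g^*h = h$. Also, since

\[ R_g^*\pi^*\mathsf{g}_B = (\pi \circ R_g)^*\mathsf{g}_B = (R_g \circ \pi)^*\mathsf{g}_B = \pi^*R_g^*\mathsf{g}_B \]

we obtain the following result.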

Proposition 4.4. Let $\mathsf{g}_G = \pi^*\mathsf{g}_B + h$ be a descending Riemannian metric on $G$. Then $\mathsf{g}_G$
is right invariant if and only if both $\mathsf{g}_B$ and $h$ are right invariant, i.e.,

\[ R_g^*\mathsf{g}_B = \mathsf{g}_B \quad\text{and}\quad R_g^*h = h \]

for all $g \in G$.

We now investigate what the geometric concepts introduced in this section imply in
the case $G = \mathrm{Diff}(M)$.

First, we introduce the manifold of smooth probability densities on $M$, which takes
the role of the manifold $B$ above. It is given by

\[ \mathrm{Dens}(M) = \Big\{ \nu \in \Omega^n(M);\ \nu > 0,\ \int_M \nu = 1 \Big\}. \]

The tangent spaces of $\mathrm{Dens}(M)$ are $T_\nu\mathrm{Dens}(M) = \Omega^n_0(M) := \{a \in \Omega^n(M);\ \int_M a = 0\}$.

$\mathrm{Diff}(M)$ acts on $\mathrm{Dens}(M)$ from the right by pullback $R_\varphi(\nu) = \varphi^*\nu$. The corresponding
lifted action is again given by pullback, i.e., $TR_\varphi(\nu, a) = (\varphi^*\nu, \varphi^*a)$.

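Consider now the volume form $\mathrm{vol} \in \mathrm{Dens}(M)$, corresponding to the Riemannian
structure on $M$. Using the action $R_\varphi$ and this volume form vol, we define the projection
map $\pi_{\mathrm{vol}} : \mathrm{Diff}(M) \to \mathrm{Dens}(M)$ by $\pi_{\mathrm{vol}}(\varphi) = R_\varphi(\mathrm{vol}) = \varphi^*\mathrm{vol}$. This map is a
submersion, since the action is transitive, which is proved by Moser [28]. Furthermore,
the corresponding isotropy group is given by $\mathrm{Diff}_{\mathrm{vol}}(M)$, i.e., if $\psi \in \mathrm{Diff}_{\mathrm{vol}}(M)$ then
$\pi_{\mathrm{vol}}(\psi \circ \varphi) = \pi_{\mathrm{vol}}(\varphi)$. Accordingly, with $\mathrm{Diff}_{\mathrm{vol}}(M)$ acting on $\mathrm{Diff}(M)$ from the left, we
have the principal bundle structure

\[ \mathrm{Diff}_{\mathrm{vol}}(M) \hookrightarrow \mathrm{Diff}(M) \xrightarrow{\ \pi_{\mathrm{vol}}\ } \mathrm{Dens}(M). \tag{11} \]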

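The vertical distribution $\mathcal{V}$ of this bundle structure is given by vectors in $T\mathrm{Diff}(M)$ which,
right translated to $T_{\mathrm{id}}\mathrm{Diff}(M) = \mathfrak{X}(M)$, are divergence free vector fields. That is,

\[ \mathcal{V}_\varphi = \{ v \in T_\varphi\mathrm{Diff}(M);\ v \circ \varphi^{-1} \in \mathfrak{X}_{\mathrm{vol}}(M) \}. \]

In reference to the abstract formulation above: $G = \mathrm{Diff}(M)$, $B = \mathrm{Dens}(M)$, and
$G_b = \mathrm{Diff}_{\mathrm{vol}}(M)$. In particular, $\mathrm{Diff}_{\mathrm{vol}}(M)\backslash\mathrm{Diff}(M) \simeq \mathrm{Dens}(M)$.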

Remark 4.5. More specifically, it holds that $\mathrm{Diff}^s_{\mathrm{vol}}(M)\backslash\mathrm{Diff}^s(M) \simeq \mathrm{Dens}^{s-1}(M)$ if $s >
n/2 + 1$. Also, in this setting the projection $\pi_{\mathrm{vol}} : \mathrm{Diff}^s(M) \to \mathrm{Dens}^{s-1}(M)$ is smooth.
These results are then used to show that $\pi_{\mathrm{vol}} : \mathrm{Diff}(M) \to \mathrm{Dens}(M)$ is smooth with
respect to the ILH topology. Notice, however, that the principal bundle structure $\pi_{\mathrm{vol}} :
\mathrm{Diff}^s(M) \to \mathrm{Diff}^s(M)/\mathrm{Diff}^s_{\mathrm{vol}}(M)$ is only $C^0$, since the left action of $\mathrm{Diff}^s(M)$ on itself
is only continuous. See Ebin and Marsden [7, § 5] for details.

By using the Nash-Moser inverse function theorem, Hamilton [13, § III.2.5] also showed
directly that $\pi_{\mathrm{vol}} : \mathrm{Diff}(M) \to \mathrm{Dens}(M)$ is a smooth principal $\mathrm{Diff}_{\mathrm{vol}}(M)$-bundle with
respect to a Fréchet topology.

Remark 4.6. As already mentioned in the abstract setting, the choice of reference
element $\mathrm{vol} \in \mathrm{Dens}(M)$ has no canonical meaning. It simply specifies which point in
$\mathrm{Dens}(M)$ we consider to be the "identity density". Indeed, if $\nu \in \mathrm{Dens}(M)$ is another
density, then $\mathrm{Diff}_{\mathrm{vol}}(M)$ and $\mathrm{Diff}_\nu(M)$ are conjugate subgroups, i.e., there exists a $\psi \in
\mathrm{Diff}(M)$ such that $\mathrm{Diff}_\nu(M) = \psi \circ \mathrm{Diff}_{\mathrm{vol}}(M) \circ \psi^{-1}$.

Consider now the question of right invariant and descending metrics on $\mathrm{Diff}(M)$. First,
there is a natural $L^2$ metric on $\mathrm{Dens}(M)$, given by

\[ \langle\langle a, b\rangle\rangle_\nu = \int_M \frac{da}{d\nu}\frac{db}{d\nu}\,\nu, \qquad a, b \in \Omega^n_0(M), \tag{12} \]

where $da/d\nu,\ db/d\nu \in \Omega^0(M)$ are the Radon-Nikodym derivatives of $a$ and $b$ with respect
to $\nu$. This metric is called the Fisher metric. As mentioned already, it is fundamental
in the theory of information geometry. In addition, it is used in statistical mechanics,
for measuring "thermodynamic length" (see e.g. [6, 10]), and in quantum mechanics (see
e.g. [9]). Notice that the Fisher metric is canonical, in the sense that it is independent
of the Riemannian structure on $M$.

Remark 4.7. One can also write the Fisher metric (12) as

\[ \langle\langle a, b\rangle\rangle_\nu = \int_M (\star_\nu a)\, b, \]

where $\star_\nu : \Omega^n(M) \to \Omega^0(M)$ is the Hodge star on $n$-forms corresponding to $\nu$.



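Proposition 4.8. The Fisher metric (12) on $\mathrm{Dens}(M)$ is invariant with respect to the
action $R_\varphi$.

Proof.

\[ \langle\langle \varphi^*a, \varphi^*b\rangle\rangle_{\varphi^*\nu} = \int_M (\star_{\varphi^*\nu}\varphi^*a)\,\varphi^*b = \int_M \varphi^*\big((\star_\nu a)\, b\big) = \langle\langle a, b\rangle\rangle_\nu. \]

$\square$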
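The invariance in Proposition 4.8 can be checked numerically in the simplest case $M = S^1$. The following is a minimal sketch, not part of the original argument: the grid size, the density `rho`, the tangent vectors `a`, `b`, and the circle diffeomorphism `phi` are all illustrative assumptions, and a 1-form $c(x)\,dx$ is represented by its coefficient function $c$.

```python
import numpy as np

def fisher(a, b, rho, dx):
    """Discrete Fisher pairing: sum of (a/rho) * (b/rho) * rho * dx."""
    return float(np.sum(a * b / rho) * dx)

def pullback(coeff, phi, dphi):
    """Pull back the 1-form coeff(x) dx by phi: (coeff o phi) * phi' dx."""
    return lambda x: coeff(phi(x)) * dphi(x)

# Uniform periodic grid on the circle [0, 2*pi) (assumed discretisation).
N = 512
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dx = 2.0 * np.pi / N

# Hypothetical data: a positive density of total mass 1 and two tangent vectors.
rho = lambda y: (1.0 + 0.3 * np.sin(y)) / (2.0 * np.pi)
a = lambda y: np.cos(y)
b = lambda y: np.sin(2.0 * y) + 0.5 * np.cos(y)

# Hypothetical diffeomorphism of the circle (phi' > 0 because |eps| < 1).
eps = 0.4
phi = lambda y: y + eps * np.sin(y)
dphi = lambda y: 1.0 + eps * np.cos(y)

# Fisher pairing of the pulled-back vectors at the pulled-back density ...
lhs = fisher(pullback(a, phi, dphi)(x),
             pullback(b, phi, dphi)(x),
             pullback(rho, phi, dphi)(x), dx)
# ... compared with the pairing of the original vectors at the original density.
rhs = fisher(a(x), b(x), rho(x), dx)
print(lhs, rhs)
```

Both sums are periodic trapezoidal rules applied to smooth integrands, so the two numbers should agree to high accuracy on this grid.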



We now come to the main result of this section. It states that the $\alpha$-$\beta$-$\gamma$ metric on
$\mathrm{Diff}(M)$ descends to the Fisher metric on $\mathrm{Dens}(M)$.

Theorem 4.9. The $\alpha$-$\beta$-$\gamma$ metric (5) on $\mathrm{Diff}(M)$ descends to a metric on $\mathrm{Dens}(M)$,
which, up to multiplication with $\beta$, is the Fisher metric (12).

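Proof. First, since the inner product (4) on $\mathfrak{X}(M)$ corresponding to the $\alpha$-$\beta$-$\gamma$ metric
preserves orthogonality with respect to the Helmholtz decomposition, it follows that the
horizontal distribution is given by

\[ \mathcal{H}_\varphi = \{ v \in T_\varphi\mathrm{Diff}(M);\ v \circ \varphi^{-1} \in \mathrm{grad}(\mathcal{F}(M)) \}, \]

i.e., vectors that, when translated to the identity, are given by gradient vector fields.
Now, if $u \in \mathcal{V}_{\mathrm{id}} = \mathfrak{X}_{\mathrm{vol}}(M)$ and $f, g \in \mathcal{F}(M)$, then

\[ \begin{aligned}
\langle \mathcal{L}_u\mathrm{grad}(f), \mathrm{grad}(g)\rangle_{\alpha\beta\gamma}
  &= \int_M \beta\, \delta(\mathcal{L}_u\mathrm{grad}(f))^\flat\, \delta\,\mathrm{grad}(g)^\flat\, \mathrm{vol} \\
  &= \int_M \beta\, \mathrm{d}\,\mathrm{i}_{\mathcal{L}_u\mathrm{grad}(f)}\mathrm{vol} \wedge \star\, \mathrm{d}\,\mathrm{i}_{\mathrm{grad}(g)}\mathrm{vol} \\
  &= -\int_M \beta\, \mathrm{d}\,\mathrm{i}_{\mathrm{grad}(f)}\mathrm{vol} \wedge \star\, \mathrm{d}\,\mathrm{i}_{\mathcal{L}_u\mathrm{grad}(g)}\mathrm{vol} \\
  &= -\langle \mathrm{grad}(f), \mathcal{L}_u\mathrm{grad}(g)\rangle_{\alpha\beta\gamma},
\end{aligned} \]

where we have used that $\mathcal{L}_u\mathrm{vol} = 0$, and that $\mathcal{L}_u \star a = \star\mathcal{L}_u a$ for any $a \in \Omega^n(M)$, which
also follows since $u \in \mathfrak{X}_{\mathrm{vol}}(M)$. From Proposition 4.3 it now follows that the $\alpha$-$\beta$-$\gamma$ metric
is descending.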

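The tangent map $T\pi_{\mathrm{vol}}$ restricted to $T_{\mathrm{id}}\mathrm{Diff}(M) = \mathfrak{X}(M)$ is given by $u \mapsto \mathcal{L}_u\mathrm{vol}$. Now,

\[ \begin{aligned}
\langle \mathrm{grad}(f), \mathrm{grad}(g)\rangle_{\alpha\beta\gamma}
  &= \int_M \beta\, \mathrm{d}\,\mathrm{i}_{\mathrm{grad}(f)}\mathrm{vol} \wedge \star\, \mathrm{d}\,\mathrm{i}_{\mathrm{grad}(g)}\mathrm{vol} \\
  &= \int_M \beta\, \mathcal{L}_{\mathrm{grad}(f)}\mathrm{vol} \wedge \star\, \mathcal{L}_{\mathrm{grad}(g)}\mathrm{vol} \\
  &= \beta\, \langle\langle \mathcal{L}_{\mathrm{grad}(f)}\mathrm{vol}, \mathcal{L}_{\mathrm{grad}(g)}\mathrm{vol}\rangle\rangle_{\mathrm{vol}}.
\end{aligned} \]

Therefore, the $\alpha$-$\beta$-$\gamma$ metric for horizontal vectors at the identity tangent space is given by
the Fisher metric multiplied by $\beta$ of the projection of the horizontal vectors to the tangent
space $T_{\mathrm{vol}}\mathrm{Dens}(M)$. Since this holds at one tangent space, it follows from Proposition 4.8
that it holds at every tangent space (both the $\alpha$-$\beta$-$\gamma$ metric and the Fisher metric are
right invariant). This concludes the proof. $\square$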

As a consequence of this result, we now obtain a geometric explanation of the observation
in § 1.1, that solutions which are initially gradient vector fields remain gradients.
This is a consequence of a general property of Riemannian submersions, proved by Hermann
[14], which is that initially horizontal geodesics remain horizontal.



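Remark 4.10. The "components" $\mathsf{g}_B$ and $h$ for the $\alpha$-$\beta$-$\gamma$ metric are identified as follows:

\[ \langle u, v\rangle_{\alpha\beta\gamma} = \underbrace{\gamma\langle Pu^\flat, Pv^\flat\rangle_{L^2} + \alpha\langle \mathrm{d}u^\flat, \mathrm{d}v^\flat\rangle_{L^2}}_{h} + \underbrace{\beta\langle \delta u^\flat, \delta v^\flat\rangle_{L^2}}_{\pi^*\mathsf{g}_B} \]

with the same notation as in equation (4).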

5 Optimal transport and polar factorisation 

The field of optimal transport has a long history, going back to Monge [27] and Kantorovich
[17, 16]. For an overview of the subject, see, e.g., the monograph by Villani
[33], or the lecture notes by Evans [8] or McCann [26], and references therein.

In this section we study the relation between optimal transport problems and metrics
on $\mathrm{Diff}(M)$ descending to $\mathrm{Dens}(M)$. Typically, optimal transport problems are considered
with minimal restrictions on the regularity of maps and densities. Our setting is
restricted to the smooth case (or more precisely Sobolev $H^s$). The point of view is that
of Otto [29, § 4], but with the difference that $\mathrm{Dens}(M)$ is identified with right instead
of left co-sets (see § 4). We also discuss the correspondence between optimal control
problems and polar factorisation. The main result is given in § 5.1, where we establish
existence and uniqueness of the optimal information transport problem corresponding to
the $\alpha$-$\beta$-$\gamma$ metric (5), and a matching polar factorisation result for $\mathrm{Diff}^s(M)$. In addition,
as a finite dimensional analogue, we show in § 5.2 that the QR factorisation of square
matrices can be seen as a polar factorisation result corresponding to optimal transport
of inner products on $\mathbb{R}^n$.

From a geometric point of view, one can define optimal transport problems abstractly
as follows. Let $G$ be a Lie group with identity element $e$, which is acting transitively
on a manifold $B$. Assume that $G$ is equipped with a cost function $c : G \times G \to \mathbb{R}^+$. A
simple geometric formulation of Monge's original optimal transport problem is:

Given $b, b' \in B$, find $k \in \{g \in G;\ g \cdot b = b'\}$ minimising $c(e, k)$. (13)

We are interested in the case when the cost function is $c(\cdot,\cdot) = \mathrm{dist}_G(\cdot,\cdot)$, where $\mathrm{dist}_G$
is a geodesic distance corresponding to a Riemannian metric $\mathsf{g}_G$ on $G$ (we assume that
$G$ is connected, so that $(G, \mathrm{dist}_G)$ is a metric space). In particular, consider the case
when $\mathsf{g}_G$ is descending with respect to a fibre bundle structure $\pi : G \to B$. Then, loosely
speaking, the optimal transport problem reduces to: (i) finding a shortest curve on $B$
that connects $b$ and $b'$, and (ii) lifting that curve to a horizontal geodesic in $G$ and taking
the endpoint as the solution to (13). The basic reason is that a shortest curve (or a
sequence of curves attaining the infimum length in the limit) between $e$ and the fibre
$\pi^{-1}(\{b'\})$ must be horizontal. Indeed, we have the following result.

Lemma 5.1. Let $\pi : G \to B$ be a Riemannian submersion, and let $\zeta : [0,1] \to G$ be an
arbitrary curve. Then there exists a unique horizontal curve $\zeta_h : [0,1] \to G$ such that
$\zeta_h(0) = \zeta(0)$ and $\pi \circ \zeta = \pi \circ \zeta_h$. The length of $\zeta_h$ is less than or equal to the length of $\zeta$,
with equality if and only if $\zeta$ is horizontal.

Proof. For each $t \in [0,1]$ there is a unique decomposition $\dot\zeta(t) = v(t) + h(t)$, where
$v(t) \in \mathcal{V}_{\zeta(t)}$ and $h(t) \in \mathcal{H}_{\zeta(t)}$. Thus, we have the curves $t \mapsto v(t) \in \mathcal{V}$ and $t \mapsto h(t) \in
\mathcal{H}$. By the projection $\pi$ we also get a curve $\bar\zeta(t) = \pi(\zeta(t)) \in B$. This curve can be
lifted to a horizontal curve as follows. Take any time-dependent vector field $\bar X_t$ on $B$
for which $\bar\zeta$ is an integral curve, i.e., $\dot{\bar\zeta}(t) = \bar X_t(\bar\zeta(t))$. Now lift $\bar X_t$ to its corresponding
horizontal section $X_t(g) = (T_g\pi)^{-1} \cdot \bar X_t(\pi(g))$. (We can do this since $T_g\pi : \mathcal{H}_g \to T_{\pi(g)}B$
is an isomorphism.) Next, let $\zeta_h$ be the unique integral curve of $X_t$ with $\zeta_h(0) = \zeta(0)$.
By construction it holds that $\pi(\zeta_h(t)) = \pi(\zeta(t))$. By construction we also have that
$\mathsf{g}_G(\dot\zeta_h(t), \dot\zeta_h(t)) = \mathsf{g}_G(h(t), h(t))$. Thus, $\mathsf{g}_G(\dot\zeta_h(t), \dot\zeta_h(t)) \le \mathsf{g}_G(\dot\zeta(t), \dot\zeta(t))$, with equality if
and only if $\dot\zeta(t) \in \mathcal{H}$.

It remains to show that $\zeta_h$ is unique. Assume that $\zeta_h' : [0,1] \to G$ is another horizontal
curve such that $\pi \circ \zeta_h' = \bar\zeta$ and $\zeta_h'(0) = \zeta(0)$. By differentiation with respect to $t$ we
obtain

\[ T_{\zeta_h'(t)}\pi \cdot \dot\zeta_h'(t) = \dot{\bar\zeta}(t) = \bar X_t(\bar\zeta(t)) = \bar X_t(\pi(\zeta_h'(t))) = T_{\zeta_h'(t)}\pi \cdot X_t(\zeta_h'(t)). \]

Since $\zeta_h'$ is horizontal, it follows that $\zeta_h'$ is an integral curve of $X_t$. Since $\zeta_h'$ fulfils the
same initial condition as $\zeta_h$, it follows from uniqueness of integral curves that $\zeta_h' = \zeta_h$,
which concludes the proof. $\square$

Remark 5.2. Notice that the result in Lemma 5.1 holds in the case when $G$ and $B$ are
Banach manifolds, with a smooth bundle structure $\pi : G \to B$. It is not necessary that $G$
is a Banach Lie group, i.e., that the group operations are smooth. This is important for
the main example in § 5.1.

For a cost function corresponding to a descending Riemannian metric, Lemma 5.1
shows that the optimal transport problem (13) reduces to a problem entirely on $B$, namely
to find a shortest curve between two given elements $b'', b' \in B$. It is not always the case
that this problem is easier to solve. However, if the geometry of the Riemannian manifold
$(B, \mathsf{g}_B)$ is well understood, for example if any two elements in $B$ can be connected by a
minimal geodesic, then the problem simplifies significantly.

Also related to optimal transport problems is the concept of polar factorisation. Indeed,
following Brenier [4], we introduce the polar cone as the subset of $G$ given by

\[ K = \{ k \in G;\ \mathrm{dist}_G(e, k) \le \mathrm{dist}_G(h, k),\ \forall\, h \in G_b \}. \]

Put in words, the polar cone is given by the elements in $G$ for which the closest point on
the identity fibre is $e$. We have the following result.

Proposition 5.3. Let $\mathsf{g}_G$ be a metric on $G$ which descends to a right invariant metric
$\mathsf{g}_B$ on $B$ with respect to the fibre structure $\pi_b(g) = R_g(b)$ for some fixed $b \in B$. Then the
following statements are equivalent:

1. If $b' \in B$, then there exists a unique minimal geodesic from $b$ to $b'$.

2. If $b'', b' \in B$, then there exists a unique minimal geodesic from $b''$ to $b'$.

3. There exists a unique solution to the optimal transport problem (13), and that
solution is connected to $e$ by a unique minimal geodesic.

4. Every $g \in G$ has a unique factorisation $g = hk$, with $h \in G_b$ and $k \in K$, and every
$k \in K$ is connected to $e$ by a unique minimal geodesic.

Proof. $1 \Rightarrow 2$. Since $\mathsf{g}_G$ is right invariant it holds that if $\zeta : [0,1] \to G$ is a minimal
geodesic, then so is $R_g \circ \zeta$ for any $g \in G$. Since the action $R$ is transitive, $b'' = R_g(b)$ for
some $g \in G$.

$2 \Rightarrow 3$. Let $\bar\zeta : [0,1] \to B$ be the minimal geodesic from $b$ to $b'$. Then, by Lemma 5.1,
there is a unique corresponding horizontal geodesic $\zeta : [0,1] \to G$ with $\zeta(0) = e$ and
$\pi_b(\zeta(t)) = \bar\zeta(t)$. There cannot be any curve from $e$ to $\pi_b^{-1}(\{b'\})$ which is shorter than
$\zeta$, because then $\bar\zeta$ would not be a minimal geodesic. A curve from $e$ to $\pi_b^{-1}(\{b'\})$ of the
same length as $\zeta$ must be horizontal (follows from Lemma 5.1), and therefore equal to $\zeta$
(which also follows from Lemma 5.1). Thus, if $g \in \pi_b^{-1}(\{b'\}) \setminus \{\zeta(1)\}$ then $\mathrm{dist}_G(e, g) >
\mathrm{dist}_G(e, \zeta(1))$, so $\zeta(1)$ is the unique solution to problem (13). Also, $\zeta$ is a unique minimal
geodesic between $e$ and $\zeta(1)$.

$3 \Rightarrow 4$. Let $k$ be the unique solution to (13) with $b' = \pi_b(g)$. Then $k$ and $g$ belong
to the same fibre, so $g = hk$ for some unique element $h \in G_b$. There cannot be another
such factorisation $g = h'k'$, because then $k$ would not be a unique solution to (13). Now
take any $k \in K$. Then $k$ is the unique solution to (13) with $b' = \pi_b(k)$, and that solution
is connected to $e$ by a unique minimal geodesic. Thus, any $k \in K$ is connected to $e$ by a
unique minimal geodesic.

$4 \Rightarrow 1$. Since the action $R$ is transitive, for any $b' \in B$ we can find a $g \in G$ such
that $b' = R_g(b) = \pi_b(g)$. Let $g = hk$ be the unique factorisation, and
let $\zeta : [0,1] \to G$ be the unique minimal geodesic from $e$ to $k$. Assume now that $\zeta$ is
not horizontal. Then, by Lemma 5.1, we can find a horizontal curve $\zeta_h : [0,1] \to G$
with $\zeta_h(0) = k$ and $\zeta_h(1) \in G_b$ which is strictly shorter than $\zeta$. Since $\zeta$ is a unique
minimal geodesic between $e$ and $k$, it cannot hold that $\zeta_h(1) = e$. But then we reach a
contradiction, because it cannot then hold that $k \in K$, since there is a point $\zeta_h(1)$ on
the identity fibre which is closer to $k$ than $e$. Therefore, $\zeta$ must be horizontal, and so
it descends to a corresponding geodesic $\bar\zeta$ between $b$ and $b'$. $\bar\zeta$ must be unique minimal,
otherwise $\zeta$ cannot be unique minimal. This finalises the proof. $\square$

Remark 5.4. If the metric gB is not right invariant, then Proposition 5.3 is still valid 
in the case b" = b. 

5.1 Optimal information transport 

We now consider the main example in this paper, namely $G = \mathrm{Diff}^s(M)$ equipped with
the $\alpha$-$\beta$-$\gamma$ metric (5) and $B = \mathrm{Dens}^{s-1}(M)$. As derived above, the $\alpha$-$\beta$-$\gamma$ metric descends
to $\mathrm{Dens}^{s-1}(M)$, where it is given, up to multiplication with the constant $\beta$, by the Fisher
metric. For simplicity, we assume throughout this section that $\beta = 1$. Recall that
$R_\varphi(\nu) = \varphi^*\nu$ and $\pi_{\mathrm{vol}}(\varphi) = \varphi^*\mathrm{vol}$. Also recall that if $s > n/2 + 1$, then $\mathrm{Diff}^s(M)$ and
$\mathrm{Dens}^{s-1}(M)$ are Banach manifolds, and the projection $\pi_{\mathrm{vol}} : \mathrm{Diff}^s(M) \to \mathrm{Dens}^{s-1}(M)$ is
smooth. Thus, all the prerequisites in Proposition 5.3 are fulfilled.

It was shown by Khesin, Lenells, Misiolek, and Preston [19] that the geodesic problem
on $\mathrm{Dens}^{s-1}(M)$ with respect to the Fisher metric can be formulated as an optimal transport
problem with respect to a degenerate cost function. However, as the cost function
is degenerate, solutions are not unique, so there is no corresponding polarisation result.
The $\alpha$-$\beta$-$\gamma$ metric on $\mathrm{Diff}^s(M)$ allows us to obtain a non-degenerate optimal transport
formulation in accordance with the framework above. In particular, we obtain a
factorisation result for diffeomorphisms.

Let $\lambda, \nu \in \mathrm{Dens}^{s-1}(M)$. The following Monge-Kantorovich problem is considered:

Find $\varphi \in \{\phi \in \mathrm{Diff}^s(M);\ \phi^*\lambda = \nu\}$ minimising $\mathrm{dist}_{\alpha\beta\gamma}(\mathrm{id}, \varphi)$. (14)

Here, $\mathrm{dist}_{\alpha\beta\gamma}$ is the Riemannian distance corresponding to the $\alpha$-$\beta$-$\gamma$ metric (5). Since
the $\alpha$-$\beta$-$\gamma$ metric descends to the Fisher metric, we refer to (14) as optimal information
transport.

Due to Proposition 5.3 it is enough to study geodesics on $\mathrm{Dens}^{s-1}(M)$ in order to
solve (14). Also, since the $\alpha$-$\beta$-$\gamma$ metric is right invariant, it is no restriction to assume
that $\lambda = \mathrm{vol}$.

As mentioned in the introduction, Friedrich [12] showed that the Fisher metric has
constant curvature. This implies that its geodesics are easy to analyse. Indeed, following
[19], we introduce the infinite dimensional sphere of radius $r = \sqrt{\mathrm{vol}(M)}$

\[ S^{\infty,s}(M) = \{ f \in \mathcal{F}^s(M);\ \langle f, f\rangle_{L^2} = \mathrm{vol}(M) \}. \]



If $s > n/2$ then this set is an $H^s$ Banach manifold. The $L^2$ inner product on $\mathcal{F}^s(M)$
restricted to $S^{\infty,s}(M)$ provides a weak Riemannian metric. Although weak, it has a
geodesic spray which is smooth. The geodesics are given by great circles, so it follows
that $S^{\infty,s}(M)$ is geodesically complete and that its diameter is given by $\pi\sqrt{\mathrm{vol}(M)}$.

Let $\mathcal{O}^s(M) = \{f \in S^{\infty,s}(M);\ f > 0\}$ denote the space of positive functions of radius
$\sqrt{\mathrm{vol}(M)}$. $\mathcal{O}^s(M)$ is an open subset of $S^{\infty,s}(M)$, and therefore a Banach manifold
in itself. The following result is given in [19].

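Theorem 5.5. If $s > n/2$, then the map

\[ \Phi : \mathrm{Dens}^s(M) \ni \nu \mapsto \sqrt{\frac{d\nu}{d\mathrm{vol}}} \]

is an isometric diffeomorphism $\mathrm{Dens}^s(M) \to \mathcal{O}^s(M)$. The diameter of $\mathcal{O}^s(M)$, and thus
also $\mathrm{Dens}^s(M)$, is given by $\frac{\pi}{2}\sqrt{\mathrm{vol}(M)}$.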

Notice that if $f, g \in \mathcal{O}^s(M)$, then there is a unique minimal geodesic $\sigma : [0,1] \to
S^{\infty,s}(M)$ from $f$ to $g$, and that geodesic is contained in $\mathcal{O}^s(M)$, i.e., $\mathcal{O}^s(M)$ is a convex
subset of $S^{\infty,s}(M)$. Indeed, the minimal geodesic is given by

\[ \sigma : [0,1] \ni t \mapsto \frac{\sin((1-t)\theta)}{\sin\theta}\, f + \frac{\sin(t\theta)}{\sin\theta}\, g, \qquad \theta = \arccos\Big(\frac{\langle f, g\rangle_{L^2}}{\mathrm{vol}(M)}\Big). \tag{15} \]
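As an illustration of formula (15), the following minimal sketch evaluates the great-circle curve on a discretised manifold and checks that it stays on the sphere and remains positive. The uniform quadrature `weights` (so that the discrete total volume is 1), the grid, and the endpoint functions `f` and `g` are assumptions made only for this example.

```python
import numpy as np

def sphere_geodesic(f, g, weights, t):
    """Point sigma(t) on the great circle (15) from f to g.

    f and g are sampled positive functions with weighted L^2 norm squared
    equal to vol(M) = weights.sum().
    """
    vol_M = weights.sum()
    inner = np.sum(f * g * weights)
    theta = np.arccos(np.clip(inner / vol_M, -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return f.copy()  # f and g coincide; the geodesic is constant
    return (np.sin((1.0 - t) * theta) * f + np.sin(t * theta) * g) / np.sin(theta)

def normalise(h, weights):
    """Rescale h so that its weighted L^2 norm squared equals vol(M)."""
    return h * np.sqrt(weights.sum() / np.sum(h * h * weights))

# Hypothetical discretisation of a circle with total volume 1.
N = 200
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
weights = np.full(N, 1.0 / N)

f = normalise(np.ones(N), weights)
g = normalise(np.sqrt(1.0 + 0.8 * np.cos(x)), weights)

ts = np.linspace(0.0, 1.0, 11)
curve = [sphere_geodesic(f, g, weights, t) for t in ts]
norms = [float(np.sum(s * s * weights)) for s in curve]
print(norms)
```

Since both coefficients in (15) are non-negative for $t \in [0,1]$ and $\theta \in (0, \pi)$, positivity of `f` and `g` carries over to every point of the sampled curve.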

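The polar cone of $\mathrm{Diff}^s(M)$ with respect to the $\alpha$-$\beta$-$\gamma$ metric is given by

\[ K^s(M) = \{ \varphi \in \mathrm{Diff}^s(M);\ \mathrm{dist}_{\alpha\beta\gamma}(\mathrm{id}, \varphi) \le \mathrm{dist}_{\alpha\beta\gamma}(\phi, \varphi),\ \forall\, \phi \in \mathrm{Diff}^s_{\mathrm{vol}}(M) \}. \]

Since there exists a unique minimal geodesic between vol and any $\nu \in \mathrm{Dens}^{s-1}(M)$,
it follows from Proposition 5.3 that every $\psi \in K^s(M)$ is the endpoint of a minimal
horizontal geodesic $\zeta : [0,1] \to \mathrm{Diff}^s(M)$ with $\zeta(0) = \mathrm{id}$. Since $\zeta$ is horizontal, it is of the
form $\zeta(t) = \mathrm{Exp}_{\mathrm{id}}(t\,\mathrm{grad}(w_0))$ for a unique $w_0 \in \mathcal{F}^{s+1}(M)$, where $\mathrm{Exp} : T\mathrm{Diff}^s(M) \to
\mathrm{Diff}^s(M)$ denotes the Riemannian exponential corresponding to the $\alpha$-$\beta$-$\gamma$ metric.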

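Let $\varphi \in \mathrm{Diff}^s(M)$. Due to the explicit form (15) of minimal geodesics in $\mathcal{O}^{s-1}(M)$,
and thus also $\mathrm{Dens}^{s-1}(M)$, we can compute the function $w_0 \in \mathcal{F}^{s+1}(M)$ such that
$\mathrm{Exp}_{\mathrm{id}}(\mathrm{grad}(w_0))$ is the unique element in $K^s(M)$ belonging to the same fibre as $\varphi$.
Indeed,

\[ \begin{aligned}
T\pi_{\mathrm{vol}} \cdot \mathrm{grad}(w_0)
  &= \frac{d}{dt}\Big|_{t=0} \pi_{\mathrm{vol}}\big(\mathrm{Exp}_{\mathrm{id}}(t\,\mathrm{grad}(w_0))\big) \\
  &= \frac{d}{dt}\Big|_{t=0} \pi_{\mathrm{vol}}(\zeta(t)) = \frac{d}{dt}\Big|_{t=0} \sigma(t)^2\,\mathrm{vol} = 2\dot\sigma(0)\sigma(0)\,\mathrm{vol},
\end{aligned} \]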



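where $\sigma(t)$ is the curve (15) with $f = 1$ and $g = \sqrt{\mathrm{Jac}(\varphi)}$ (recall that the Jacobian is
defined by $\mathrm{Jac}(\varphi)\mathrm{vol} = \varphi^*\mathrm{vol}$). Since $\sigma(0) = 1$, and since $T\pi_{\mathrm{vol}} \cdot \mathrm{grad}(w_0) = \mathcal{L}_{\mathrm{grad}(w_0)}\mathrm{vol} =
\Delta w_0\,\mathrm{vol}$, we get

\[ \Delta w_0 = \frac{2\theta\sqrt{\mathrm{Jac}(\varphi)} - 2\theta\cos\theta}{\sin\theta}, \qquad \theta = \arccos\Big(\frac{\int_M \sqrt{\mathrm{Jac}(\varphi)}\,\mathrm{vol}}{\mathrm{vol}(M)}\Big). \tag{16} \]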
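On a closed manifold the Poisson equation $\Delta w_0 = \rho$ is solvable only if $\rho$ integrates to zero, and the right-hand side of (16) has this property by the definition of $\theta$. The following minimal sketch checks this on a discretised circle. The quadrature `weights` and the sample Jacobian `jac` (positive with mean one, as the Jacobian of a diffeomorphism must be when the total volume is preserved) are assumptions for illustration only.

```python
import numpy as np

# Hypothetical discretisation of a circle with total volume 1.
N = 256
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
weights = np.full(N, 1.0 / N)
vol_M = weights.sum()

# Assumed Jacobian of some diffeomorphism: positive, with mean value 1.
jac = 1.0 + 0.5 * np.cos(x)
root = np.sqrt(jac)

# Angle theta from (16), using f = 1 and g = sqrt(Jac) in (15).
theta = np.arccos(np.sum(root * weights) / vol_M)

# Right-hand side of (16), i.e. the prescribed value of the Laplacian of w_0.
rhs = 2.0 * theta * (root - np.cos(theta)) / np.sin(theta)

# Compatibility condition for the Poisson problem: zero mean.
mean_rhs = float(np.sum(rhs * weights))
print(theta, mean_rhs)
```

The mean vanishes because $\cos\theta$ is exactly the weighted mean of $\sqrt{\mathrm{Jac}}$, so the check holds up to rounding for any admissible Jacobian.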

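Consider now the problem of finding the horizontal geodesic $\zeta(t) = \mathrm{Exp}_{\mathrm{id}}(t\,\mathrm{grad}(w_0))$.
Of course, one can always solve equation (2) with $u(0) = \mathrm{grad}(w_0)$, and then reconstruct
$\zeta(t)$ by integrating the non-autonomous equation $\dot\zeta(t) = u(t) \circ \zeta(t)$ with $\zeta(0) = \mathrm{id}$.
However, since we know the projected geodesic curve (15) explicitly, we may also proceed
by directly lifting that curve to a corresponding horizontal geodesic $\zeta(t)$. Indeed,
the geodesic $\bar\zeta(t)$ in $\mathrm{Dens}^{s-1}(M)$ corresponding to $\zeta(t)$ is given by $\bar\zeta(t) = \Phi^{-1}(\sigma(t)) =
\sigma(t)^2\mathrm{vol}$. Since $\pi_{\mathrm{vol}}(\zeta(t)) = \bar\zeta(t)$ and

\[ TR_{\zeta(t)^{-1}}(\zeta(t), \dot\zeta(t)) = (\mathrm{id}, \dot\zeta(t) \circ \zeta(t)^{-1}) =: (\mathrm{id}, \mathrm{grad}(w_t)), \qquad w_t \in \mathcal{F}^{s+1}(M), \]

we conclude that

\[ \begin{aligned}
\dot{\bar\zeta}(t) &= T_{\mathrm{id}}(R_{\zeta(t)} \circ \pi_{\mathrm{vol}}) \cdot \mathrm{grad}(w_t) \\
  &= \zeta(t)^*\big(\mathcal{L}_{\mathrm{grad}(w_t)}\mathrm{vol}\big) \\
  &= \mathrm{div}(\mathrm{grad}(w_t)) \circ \zeta(t)\, \mathrm{Jac}(\zeta(t))\,\mathrm{vol}.
\end{aligned} \]

By using $\dot{\bar\zeta}(t) = 2\dot\sigma(t)\sigma(t)\mathrm{vol}$ and $\mathrm{Jac}(\zeta(t)) = \sigma(t)^2$, we get

\[ 2\dot\sigma(t)\sigma(t) = (\Delta w_t \circ \zeta(t))\,\sigma(t)^2. \]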

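The horizontal geodesic $\zeta(t)$, and its inverse $\zeta_{\mathrm{inv}}(t) := \zeta(t)^{-1}$, can now be constructed
by solving the following non-autonomous ordinary differential equation

\[ \begin{aligned}
\dot\zeta(t) &= \mathrm{grad}(w_t) \circ \zeta(t), & \zeta(0) &= \mathrm{id}, \\
\dot\zeta_{\mathrm{inv}}(t) &= -(T\zeta(t))^{-1} \cdot \mathrm{grad}(w_t), & \zeta_{\mathrm{inv}}(0) &= \mathrm{id}, \\
\Delta w_t &= \frac{2\dot\sigma(t)}{\sigma(t)} \circ \zeta_{\mathrm{inv}}(t), \\
\sigma(t) &= \frac{\sin((1-t)\theta)}{\sin\theta} + \frac{\sin(t\theta)}{\sin\theta}\sqrt{\mathrm{Jac}(\varphi)}, & \theta &= \arccos\Big(\frac{\int_M \sqrt{\mathrm{Jac}(\varphi)}\,\mathrm{vol}}{\mathrm{vol}(M)}\Big).
\end{aligned} \tag{17} \]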

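The equation for $\zeta_{\mathrm{inv}}(t)$ is obtained by

\[ 0 = \frac{d}{dt}\big(\zeta(t) \circ \zeta(t)^{-1}\big) = \dot\zeta(t) \circ \zeta(t)^{-1} + T\zeta(t) \cdot \frac{d}{dt}\big(\zeta(t)^{-1}\big). \]

Notice that we already know that equation (17) has a unique solution for $t \in [0,1]$.
As required, the equation is invariant with respect to any substitution $\varphi \to \phi \circ \varphi$ with
$\phi \in \mathrm{Diff}^s_{\mathrm{vol}}(M)$.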

In summary, we have proved the following result. 

Theorem 5.6. Let $s > n/2 + 1$. Then every $\varphi \in \mathrm{Diff}^s(M)$ admits a unique factorisation
$\varphi = \phi \circ \psi$, with $\phi \in \mathrm{Diff}^s_{\mathrm{vol}}(M)$ and $\psi \in K^s(M)$. It holds that $\psi = \mathrm{Exp}_{\mathrm{id}}(\mathrm{grad}(w_0))$
with $w_0$ given by equation (16). There is a unique minimal horizontal geodesic $\zeta(t)$ with
$\zeta(0) = \mathrm{id}$ and $\zeta(1) = \psi$, which can be computed by solving equation (17).

Remark 5.7. Notice that the polar factorisation in Theorem 5.6 does not depend on
the parameters $\alpha$, $\beta$, and $\gamma$. The reason for this is that every parameter choice yields the
same horizontal distribution, and the same horizontal geodesics.




5.2 Optimal transport of inner products and QR factorisation 

In this section we show how the QR factorisation of square matrices is related to optimal
transport of inner products on $\mathbb{R}^n$. The example provides a finite dimensional analogue
to optimal information transport described in § 5.1 above. We do not rigorously address
questions of global existence and uniqueness of geodesics (local existence and uniqueness
follows automatically since we consider geodesics on a smooth manifold with respect to a
smooth metric). Rather, the aim is to provide geometrical insight into the QR factorisation
and its relation to the Cholesky factorisation.

The setting is as follows. Let $G = \mathrm{GL}(n)$ over the field $\mathbb{R}$ and let $B = \mathrm{Sym}(n)^+$ be
the manifold of inner products on $\mathbb{R}^n$. $\mathrm{Sym}(n)^+$ is identified with the space of symmetric
positive definite $n \times n$ matrices, i.e., if $M \in \mathrm{Sym}(n)^+$ is a symmetric positive definite
matrix, then the corresponding inner product is $\langle x, y\rangle_M = x^\top My$. Notice that $\mathrm{Sym}(n)^+$
is a convex open subset of the vector space $\mathrm{Sym}(n)$ of all symmetric $n \times n$ matrices.

The group $\mathrm{GL}(n)$ acts on $\mathrm{Sym}(n)^+$ from the right by $R_A(M) = A^\top MA$. This action
is transitive, which follows (for example) from the fact that every positive definite symmetric
matrix has a Cholesky factorisation. If $U \in T_M\mathrm{Sym}(n)^+$, then the lifted action is
$T_MR_A \cdot U = A^\top UA$.

Let $I$ denote the identity matrix (which is an element in both $\mathrm{GL}(n)$ and $\mathrm{Sym}(n)^+$),
and consider the projection $\pi_I : \mathrm{GL}(n) \to \mathrm{Sym}(n)^+$ given by $\pi_I(A) = R_A(I) =
A^\top A$. The corresponding isotropy group is $G_I = \mathrm{SO}(n)$, which follows since $\pi_I(QA) =
A^\top Q^\top QA = A^\top A = \pi_I(A)$ for all $Q \in \mathrm{SO}(n)$. Thus, we have a principal $\mathrm{SO}(n)$-bundle
$\pi_I : \mathrm{GL}(n) \to \mathrm{Sym}(n)^+$.
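The two structural facts used here, that $R_A(M) = A^\top MA$ is a right action and that $\pi_I$ is constant along left multiplication by rotations, can be confirmed numerically. The sketch below uses randomly generated matrices; the helper names `act` and `proj` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def act(M, A):
    """Right action of A on the symmetric positive definite matrix M."""
    return A.T @ M @ A

def proj(A):
    """Projection pi_I(A) = R_A(I) = A^T A."""
    return act(np.eye(A.shape[0]), A)

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# A random symmetric positive definite matrix.
S = rng.standard_normal((n, n))
M = S.T @ S + n * np.eye(n)

# Right action: acting by the product A B equals acting by A first, then by B.
right_action_ok = np.allclose(act(M, A @ B), act(act(M, A), B))

# A random rotation: orthogonalise, then flip one column if det is -1.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0

# The fibre through A contains Q A for every rotation Q.
fibre_ok = np.allclose(proj(Q @ A), proj(A))
print(right_action_ok, fibre_ok)
```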

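There is a natural metric $\mathsf{g}_B$ on $\mathrm{Sym}(n)^+$ given by

\[ \mathsf{g}_{B,M}(U, V) = \mathrm{tr}(M^{-1}UM^{-1}V), \qquad U, V \in T_M\mathrm{Sym}(n)^+ = \mathrm{Sym}(n). \tag{18} \]

This metric is invariant with respect to the action $R_A$. Indeed, using the cyclic property
of the trace in the fourth step,

\[ \begin{aligned}
\mathsf{g}_{B,R_A(M)}(T_MR_A \cdot U, T_MR_A \cdot V)
  &= \mathrm{tr}\big((A^\top MA)^{-1}A^\top UA\,(A^\top MA)^{-1}A^\top VA\big) \\
  &= \mathrm{tr}\big(A^{-1}M^{-1}A^{-\top}A^\top UA\,A^{-1}M^{-1}A^{-\top}A^\top VA\big) \\
  &= \mathrm{tr}\big(A^{-1}M^{-1}UM^{-1}VA\big) \\
  &= \mathrm{tr}\big(M^{-1}UM^{-1}VAA^{-1}\big) \\
  &= \mathrm{tr}\big(M^{-1}UM^{-1}V\big) = \mathsf{g}_{B,M}(U, V).
\end{aligned} \]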

We now proceed by defining a metric on $\mathrm{GL}(n)$. Consider the projection operator
$\ell : \mathrm{Mat}(n,n) \to \mathrm{Mat}(n,n)$ given by

\[ \ell(U)_{ij} = \begin{cases} U_{ij} & \text{if } i > j, \\ 0 & \text{otherwise.} \end{cases} \]

Thus, $\ell(U)$ is the matrix which is equal to $U$ at the strictly lower triangular entries, and
zero at the other entries. Next, let $\mathsf{g}_G$ be the right invariant metric on $\mathrm{GL}(n)$ which at
the identity is given by

\[ \mathsf{g}_{G,I}(u, v) = \mathrm{tr}(\ell(u)^\top\ell(v)) + \mathrm{tr}((u + u^\top)(v + v^\top)). \tag{19} \]

By right translation we then have $\mathsf{g}_{G,A}(U, V) = \mathsf{g}_{G,I}(UA^{-1}, VA^{-1})$. Notice that the
orthogonal complement of $T_I\mathrm{SO}(n) = \mathfrak{so}(n)$ with respect to this metric is given by the
upper triangular matrices, which follows since matrices in $\mathfrak{so}(n)$ are skew symmetric,
so the second term in (19) vanishes if either $u$ or $v$ belongs to $\mathfrak{so}(n)$. In other words,
$\mathfrak{so}(n)^\perp = \mathfrak{upp}(n) := \{u \in \mathfrak{gl}(n);\ \ell(u) = 0\}$.

Proposition 5.8. The right invariant metric $\mathsf{g}_G$ on $\mathrm{GL}(n)$ is descending with respect
to $\pi_I$. The corresponding metric on $\mathrm{Sym}(n)^+$ is given by $\mathsf{g}_B$.

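Proof. By Proposition 4.3 we first need to show that

\[ \mathsf{g}_{G,I}(\mathrm{ad}_\xi(u), v) + \mathsf{g}_{G,I}(u, \mathrm{ad}_\xi(v)) = 0, \qquad \forall\, u, v \in \mathfrak{upp}(n),\ \xi \in \mathfrak{so}(n). \]

Since $\ell(v) = 0$ and $[\xi, u]^\top = [\xi, u^\top]$ for skew symmetric $\xi$, we have

\[ \mathsf{g}_{G,I}(\mathrm{ad}_\xi(u), v) = \mathrm{tr}\big(([\xi, u] + [\xi, u]^\top)(v + v^\top)\big) = \mathrm{tr}\big([\xi, u + u^\top](v + v^\top)\big). \]

By using the cyclic property of the trace we then get

\[ \mathrm{tr}\big([\xi, u + u^\top](v + v^\top)\big) = -\mathrm{tr}\big((u + u^\top)[\xi, v + v^\top]\big) = -\mathsf{g}_{G,I}(u, \mathrm{ad}_\xi(v)). \]

Therefore, the metric is descending. Next, if $u \in \mathfrak{upp}(n)$, then $T_I\pi_I \cdot u = u + u^\top$, so

\[ \mathsf{g}_{G,I}(u, v) = \mathrm{tr}((u + u^\top)(v + v^\top)) = \mathsf{g}_{B,I}(T_I\pi_I \cdot u, T_I\pi_I \cdot v). \]

Since $\mathsf{g}_B$ is right invariant it follows that $\mathsf{g}_G$ descends to $\mathsf{g}_B$, which proves the result. $\square$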
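The two algebraic facts behind Proposition 5.8, orthogonality of $\mathfrak{so}(n)$ and $\mathfrak{upp}(n)$ under (19) and the skew-symmetry condition of Proposition 4.3, can be tested on random matrices. This is a minimal numerical sketch; the function names `ell`, `g_id`, and `ad` are hypothetical helpers mirroring $\ell$, $\mathsf{g}_{G,I}$, and the matrix commutator.

```python
import numpy as np

def ell(u):
    """Projection onto the strictly lower triangular part."""
    return np.tril(u, k=-1)

def g_id(u, v):
    """Inner product (19) at the identity of GL(n)."""
    return np.trace(ell(u).T @ ell(v)) + np.trace((u + u.T) @ (v + v.T))

def ad(xi, u):
    """Matrix commutator [xi, u]."""
    return xi @ u - u @ xi

rng = np.random.default_rng(1)
n = 5
W = rng.standard_normal((n, n))
xi = W - W.T                               # skew symmetric, element of so(n)
u = np.triu(rng.standard_normal((n, n)))   # upper triangular, element of upp(n)
v = np.triu(rng.standard_normal((n, n)))

# so(n) and upp(n) are orthogonal with respect to (19).
orth = g_id(xi, u)

# Condition of Proposition 4.3 for the metric to be descending.
desc = g_id(ad(xi, u), v) + g_id(u, ad(xi, v))
print(orth, desc)
```

Both printed values should vanish up to rounding error, for any choice of the random seed.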

The horizontal distribution $\mathcal{H}$ is given by $\mathcal{H}_A = \mathfrak{upp}(n)A$. Since $\mathfrak{upp}(n)$ is a Lie algebra,
i.e., it is closed under the matrix commutator, it holds that the horizontal distribution is
integrable. Its integral manifold through the identity is given by the Lie group of upper
triangular $n \times n$ matrices whose diagonal entries are strictly positive. This Lie group is
denoted $\mathrm{Upp}(n)$.

Let $A \in \mathrm{GL}(n)$. If there exists a unique minimal geodesic $\bar\zeta : [0,1] \to \mathrm{Sym}(n)^+$ from
$I$ to $\pi_I(A)$, then, in accordance with the framework above, we obtain a factorisation
$A = QR$, with $Q \in \mathrm{SO}(n)$ and $R \in \mathrm{Upp}(n)$.

Remark 5.9. Since the metric (19) is smooth, it follows from standard results in Riemannian
geometry that there exists a neighbourhood of $I$ in $\mathrm{Sym}(n)^+$ such that any
element in it is connected to $I$ by a unique minimal geodesic. Therefore, if $\pi_I(A)$ is close
enough to $I$, then $A$ has a unique QR factorisation. Also, the QR factorisation of any
matrix is well known to exist, and is unique if the matrix is invertible, which suggests the
existence of minimal geodesics. Details concerning these questions are not investigated
in this paper.

In summary, we see that the factor $R$ solves the problem of optimally (with respect
to the cost function $\mathrm{dist}_G$) transporting the Euclidean inner product on $\mathbb{R}^n$ to the inner
product defined by $M = A^\top A$. Furthermore, the factor $R$ is the transpose of the Cholesky
factorisation of $M$. Indeed, if $L = R^\top$ then $M = \pi_I(A) = \pi_I(R) = R^\top R = LL^\top$.
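The relation between the QR factor $R$ and the Cholesky factor of $M = A^\top A$ is easy to confirm numerically. The sketch below uses NumPy's `numpy.linalg.qr` and `numpy.linalg.cholesky` on a random matrix; since the library does not guarantee a positive diagonal for $R$, the signs are normalised by hand, which is an extra step assumed here to match the convention $R \in \mathrm{Upp}(n)$. Note that the computed `Q` is orthogonal, and its determinant has the same sign as $\det A$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))   # invertible with probability one

# QR factorisation; the diagonal of R may contain negative entries.
Q, R = np.linalg.qr(A)

# Normalise so that R has a strictly positive diagonal: A = (Q D)(D R), D^2 = I.
signs = np.sign(np.diag(R))
Q = Q * signs            # scales the columns of Q
R = signs[:, None] * R   # scales the rows of R

# Lower triangular Cholesky factor of M = A^T A, so that M = L L^T.
L = np.linalg.cholesky(A.T @ A)

print(np.allclose(Q @ R, A), np.allclose(R.T, L))
```

Agreement of `R.T` with `L` reflects the uniqueness of the Cholesky factor with positive diagonal.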

Usually, the QR factorisation is obtained by direct linear algebraic manipulations. Another
way to compute the $R$ component is to solve the geodesic equation on $\mathrm{Sym}(n)^+$, and
lift that geodesic to a horizontal geodesic on $\mathrm{GL}(n)$. Although this is probably inefficient
compared to existing algorithms (there are very fast algorithms based on Householder
reflections), the geodesic approach might nevertheless provide some insights, for example
in the case of sparse matrices.

Remark 5.10. The setting can be extended to GL(n, C) by replacing SO(n) with U(n), 
and every transpose with the Hermitian conjugate. 